Document field types

Strings

These are vectorised unless the field is listed in non_tensor_fields at index time.

Floats

These aren't vectorised, but can be used to filter search results.

Bools

These aren't vectorised, but can be used to filter search results.

Ints

These aren't vectorised, but can be used to filter search results.
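
The rules for the four scalar types above can be summarised in a small sketch. The function `is_vectorised` and the field names `price`, `in_stock` and `count` are hypothetical and exist only for illustration. They are not part of the Marqo client, and the function only mirrors the rules stated in this section.

```python
def is_vectorised(field_name, value, non_tensor_fields=()):
    """Return True if a scalar field would be vectorised under the rules above.

    Hypothetical helper for illustration; not a Marqo API.
    """
    if isinstance(value, str):
        # Strings are vectorised unless listed in non_tensor_fields.
        return field_name not in non_tensor_fields
    if isinstance(value, (bool, int, float)):
        # Floats, bools and ints are never vectorised; they are filter-only.
        return False
    raise TypeError(f"not a scalar field type: {type(value).__name__}")


doc = {"Title": "Cool summer t-shirt", "price": 9.99, "in_stock": True, "count": 3}
vectorised = {name: is_vectorised(name, value) for name, value in doc.items()}
print(vectorised)
```

Only `Title` is reported as vectorised here. Passing `non_tensor_fields=["Title"]` would make it filter-only as well.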

Array

Currently, only arrays of strings are supported.

Array fields must be listed in non_tensor_fields at index time; otherwise an error is thrown.

This type of field can be used to filter search results and for lexical search.

Example

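# assumption: a Marqo instance is running locally on the default port
import marqo
mq = marqo.Client(url="http://localhost:8882")

# index an array field called "my tags", making sure it is in non_tensor_fields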
mq.index("my_index").add_documents(documents=[
    {"Title": "Cool summer t-shirt", "_id": "1234", 'my tags': ['summer', 'yellow']}], 
    non_tensor_fields=['my tags']
)

# do a search request that filters based on the tags
mq.index("my_index").search(
    q="Something to wear in warm weather",
    # raw string so that the backslash escaping the space is passed through unchanged
    filter_string=r"(my\ tags:yellow) AND (my\ tags:summer)"
)
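
The filter above escapes the space in the field name with a backslash. As a minimal sketch, such a filter string could be assembled programmatically. The helpers `escape_field_name` and `tag_filter` are hypothetical, not part of the Marqo client, and they assume that spaces are the only characters needing escaping, which is all the example above shows.

```python
def escape_field_name(name: str) -> str:
    # Assumption: only spaces need a backslash escape, as in the example above.
    return name.replace(" ", "\\ ")


def tag_filter(field: str, values: list) -> str:
    # Require every value to be present by joining the terms with AND.
    escaped = escape_field_name(field)
    return " AND ".join(f"({escaped}:{value})" for value in values)


filter_string = tag_filter("my tags", ["yellow", "summer"])
print(filter_string)
```

The resulting string can be passed as the `filter_string` argument of `search`.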

Multimodal combination object

The multimodal combination object works with mappings. This field can consist of multiple child fields. The contents of these child fields are vectorised and combined into a single tensor using a weighted sum. The weights are specified in mappings, and each child field must have an assigned weight.

The combined tensor is used for tensor search. The multimodal combination field cannot be in non_tensor_fields.

Child fields can be used for lexical search or for filtering in tensor search. Every child field name and every child field value must be a string (str).
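
To make the weighted-sum idea concrete, here is a minimal sketch in plain Python. The function `combine` and the two-dimensional vectors are made up for illustration; real embeddings come from the model and are much larger. Whether Marqo normalises the vectors before or after combining is not stated here, so this sketch applies no normalisation.

```python
def combine(child_vectors, weights):
    """Weighted sum of child-field vectors (illustrative, not Marqo internals)."""
    missing = set(child_vectors) - set(weights)
    if missing:
        # Each child field must have an assigned weight.
        raise ValueError(f"no weight for child fields: {sorted(missing)}")
    dim = len(next(iter(child_vectors.values())))
    combined = [0.0] * dim
    for name, vector in child_vectors.items():
        if len(vector) != dim:
            raise ValueError("all child vectors must have the same dimension")
        for i, component in enumerate(vector):
            combined[i] += weights[name] * component
    return combined


# Made-up stand-ins for the text and image embeddings.
child_vectors = {
    "my_text_attribute_1": [1.0, 0.0],
    "my_image_attribute_1": [0.0, 1.0],
}
weights = {"my_text_attribute_1": 0.5, "my_image_attribute_1": 0.5}
combined = combine(child_vectors, weights)
print(combined)  # [0.5, 0.5]
```

With equal weights, each child field contributes equally to the single tensor stored for the document.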

Example

# Create an index with "ViT-B/32" that can vectorise both text and images. 
settings = {
    "treat_urls_and_pointers_as_images": True,
    "model": "ViT-B/32",
}

mq.create_index("my_index", **settings)
# We add the document into our index
mq.index("my_index").add_documents(
    documents=[
        {
            "Title": "my document",
            "combo_text_image": {
                "my_text_attribute_1": "A rider is riding a horse jumping over the barrier.",
                "my_image_attribute_1": "https://raw.githubusercontent.com/marqo-ai/marqo/mainline/examples/ImageSearchGuide/data/image0.jpg",
            },
            "_id": "111",
        }, ],
    mappings={"combo_text_image": {"type": "multimodal_combination",
                                   "weights": {"my_text_attribute_1": 0.5,
                                               "my_image_attribute_1": 0.5,
                                               }}}
)

# tensor search
res = mq.index("my_index").search(q="A rider is riding a horse jumping over the barrier.")
# lexical search
res = mq.index("my_index").search(
    q="A rider is riding a horse jumping over the barrier.",
    search_method="lexical"
)
# filter search
res = mq.index("my_index").search(
    q="A rider is riding a horse jumping over the barrier.",
    filter_string="combo_text_image.my_text_attribute_1: (A rider is riding a horse jumping over the barrier.)"
)

Results

All three searches return the same result:

{"hits": [{"Title": "my document",
   "combo_text_image": {"my_image_attribute_1": "https://raw.githubusercontent.com/marqo-ai/marqo/mainline/examples/ImageSearchGuide/data/image0.jpg",
    "my_text_attribute_1": "A rider is riding a horse jumping over the barrier."},
   "_id": "111",
   "_highlights": {"combo_text_image": "{'my_text_attribute_1': 'A rider is riding a horse jumping over the barrier.', 'my_image_attribute_1': 'https://raw.githubusercontent.com/marqo-ai/marqo/mainline/examples/ImageSearchGuide/data/image0.jpg'}"},
   "_score": 0.879635}],
 "query": "A rider is riding a horse jumping over the barrier.",
 "limit": 10,
 "offset": 0,
 "processingTimeMs": 111}