
Optimising Search


Only use one index per Marqo cluster

For production use cases we recommend only a single Marqo index per Marqo cluster. This results in more predictable resource usage and consistent search performance.

Be selective about tensor fields

Tensor embeddings power Marqo's tensor search. Any string field can be indexed as a tensor field.

There is a natural tradeoff, however, between tensor search and storage size. For some fields, tensor search may not be worth the cost of storing full embeddings for every document. For example, categorical fields such as a song's genre or a book's category may be represented as strings, but they are mainly useful in keyword/lexical search, or as conditions that pre-filter a tensor search.
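To get a feel for the storage side of this tradeoff, the sketch below estimates the raw size of the embedding vectors alone. It is a back-of-the-envelope calculation, not a description of how Marqo stores data: the function name `estimate_embedding_bytes`, the 768-dimensional model, float32 values, and one chunk per field are all illustrative assumptions, and index overhead is ignored.

```python
# Rough estimate of raw embedding storage.
# Assumptions (illustrative, not taken from Marqo): float32 values,
# a hypothetical 768-dimensional model, and one chunk per tensor field.
# Any index overhead on top of the raw vectors is ignored.


def estimate_embedding_bytes(
    num_docs,
    num_tensor_fields,
    chunks_per_field=1,
    dimensions=768,
    bytes_per_value=4,
):
    """Return the approximate number of bytes needed for the raw vectors."""
    num_vectors = num_docs * num_tensor_fields * chunks_per_field
    return num_vectors * dimensions * bytes_per_value


# Compare embedding all three fields of a document with embedding only one.
all_fields = estimate_embedding_bytes(num_docs=1_000_000, num_tensor_fields=3)
one_field = estimate_embedding_bytes(num_docs=1_000_000, num_tensor_fields=1)

print(f"3 tensor fields: {all_fields / 1e9:.2f} GB")
print(f"1 tensor field:  {one_field / 1e9:.2f} GB")
```

Under these assumptions, one million documents need roughly 9.2 GB of raw vectors with three tensor fields and roughly 3.1 GB with one, so each field left out of tensor search saves about a third of that total.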

Marqo provides the ability to tune this tradeoff. When adding documents, only fields explicitly added to the tensor_fields parameter are indexed for tensor search. This selective indexing allows for a balance between the benefits of tensor search and storage size efficiency.

The best practice is to only select fields that will benefit from semantic and multi-modal search as tensor fields.

For example:

import marqo

mq = marqo.Client(url="http://localhost:8882")

mq.create_index("my-first-index")

mq.index("my-first-index").add_documents(
    [
        {
            "Title": "The Travels of Marco Polo",
            "Description": "A 13th-century travelogue describing Polo's travels",
            "Genre": "History",
        },
        {
            "Title": "Extravehicular Mobility Unit (EMU)",
            "Description": "The EMU is a spacesuit that provides environmental protection, "
            "mobility, life support, and communications for astronauts",
            "Genre": "Science",
        },
    ],
    tensor_fields=["Description"],
)

The above example will not store tensors against the "Genre" field, but we can still use it, for example:

# Search all of a specific genre
result = mq.index("my-first-index").search("History", search_method="LEXICAL")

# Filter out search results
results = mq.index("my-first-index").search(
    q="spacesuits", filter_string="Genre:Science"
)