Embedding Static Context
Some use cases require frequent re-use of the same embeddings across queries, for example terms that influence style or quality, as outlined in the multi-term queries section. In these cases it can be beneficial to pre-compute the embeddings: retrieval is faster, and the embeddings do not need to be recomputed for each query.
Using the Embed Endpoint and Context
The embed endpoint can be used to generate embeddings from any content and get the vectors back from Marqo. This makes it possible to create embeddings for static context ahead of time and re-use them in queries.
import marqo
mq = marqo.Client()
# Generate an embedding for a static context
context_embeddings = mq.index("my-first-index").embed(
    ["Low quality, bad, jpeg artifacts"]
)
The context_embeddings can then be used in queries to influence the results. For example, to search for items that are not low quality:
results = mq.index("my-first-index").search(
    q={"t-shirt": 1.0},
    context={
        "tensor": [{"vector": context_embeddings["embeddings"][0], "weight": -0.5}]
    },
)
Because inference runs only once for the context vector, every subsequent search that re-uses it is faster.
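The effect of the negative weight can be illustrated with toy vectors. This is a minimal sketch of the general idea rather than Marqo's exact internals; it assumes the effective query vector is the L2-normalized weighted sum of the query and context vectors, with the weights taken from the search call.

```python
import math

def weighted_combine(vectors_and_weights):
    """Sum weighted vectors element-wise, then L2-normalize the result."""
    dim = len(vectors_and_weights[0][0])
    combined = [0.0] * dim
    for vec, weight in vectors_and_weights:
        for i, v in enumerate(vec):
            combined[i] += weight * v
    norm = math.sqrt(sum(c * c for c in combined))
    return [c / norm for c in combined]

# Toy 3-d vectors standing in for real model embeddings
query_vec = [0.8, 0.1, 0.1]    # "t-shirt"
context_vec = [0.1, 0.9, 0.0]  # "Low quality, bad, jpeg artifacts"

# Negative weight pushes the effective query away from the context
effective_query = weighted_combine([(query_vec, 1.0), (context_vec, -0.5)])
```

After combining, the effective query points away from the low-quality direction, so documents similar to the context text score lower.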
Implementation Notes
In practice you will most likely want to cache these vectors in a separate persistent store; fast key-value databases with disk storage for persistence are a good choice. Other use cases for this pattern include applications such as embedding a taxonomy of style descriptions for re-use in future searches.
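The caching pattern can be sketched as follows. This is a hypothetical illustration: `embed_fn` stands in for a call to Marqo's embed endpoint, and the in-memory dict would be swapped for a persistent key-value store in production.

```python
class EmbeddingCache:
    """Caches embeddings so the model is only called on a cache miss."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._store = {}  # text -> vector; replace with a persistent KV store
        self.misses = 0

    def get(self, text):
        if text not in self._store:
            self.misses += 1  # inference only happens here
            self._store[text] = self._embed_fn(text)
        return self._store[text]

# Usage with a fake embed function; a real one would call
# mq.index("my-first-index").embed([text]) and extract the vector.
fake_embed = lambda text: [float(len(text)), 0.0]
cache = EmbeddingCache(fake_embed)

v1 = cache.get("Low quality, bad, jpeg artifacts")  # cache miss, runs embed_fn
v2 = cache.get("Low quality, bad, jpeg artifacts")  # served from the cache
```

The same `get` call can then feed the `context` parameter of every search that re-uses the vector.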