Reranking models

Reranking models are applied after an initial set of results is retrieved. The reranking model takes the list of results and reorders them according to additional criteria or constraints.


Image reranking works on fields that contain image pointers or references; it does not work on text fields.


OWL-ViT is a reranking model that allows localisation and can be used with both tensor and lexical search.


OWL-ViT is a Vision Transformer for Open-World Localization. It is an object detector that can be used to localise objects within an image. OWL-ViT differs from other object detectors in that it: 1) is open vocabulary and can detect any class, and 2) performs conditional localisation, taking context into account when determining the location of an object. This permits its use as a reranking model, with the query serving as the context on which the localisation is conditioned.

When used as a reranker, the query and candidate images are fed into OWL-ViT, proposals for the context are generated, and the final ranking is determined by the highest-scoring proposal per image. In addition, the location (x1, y1, x2, y2) of the highest-scoring proposal is returned as a highlight in the _highlights field of the response. The coordinates are on the scale of the original image and use the same coordinate system as PIL: the top-left corner is (0, 0).
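The coordinate convention can be checked with PIL directly. A minimal sketch using an illustrative box (placeholder values, not real model output):

```python
from PIL import Image

# Hypothetical highlight box (x1, y1, x2, y2) in original-image pixels,
# with (0, 0) at the top-left corner, as returned in _highlights.
box = (40, 25, 180, 140)

# Stand-in for the original image; in practice this would be loaded from disk.
image = Image.new("RGB", (256, 192))

# PIL's crop uses the same (x1, y1, x2, y2) convention, so the highlight
# can be passed straight through to extract the localised region.
crop = image.crop(box)
print(crop.size)  # (x2 - x1, y2 - y1)
```

Because the conventions match, no coordinate translation is needed between the highlight and PIL operations such as crop.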


# Create a structured index
import marqo

settings = {
    "type": "structured",
    "vectorNumericType": "float",
    "model": "open_clip/ViT-B-32/laion2b_s34b_b79k",
    "normalizeEmbeddings": True,
    "textPreprocessing": {
        "splitLength": 2,
        "splitOverlap": 0,
        "splitMethod": "sentence",
    },
    "imagePreprocessing": {"patchMethod": None},
    "allFields": [
        {"name": "Title", "type": "text", "features": ["lexical_search"]},
        {"name": "Description", "type": "text", "features": ["lexical_search"]},
        {"name": "image_location", "type": "image_pointer"},
    ],
    "tensorFields": ["image_location", "Description"],
    "annParameters": {
        "spaceType": "prenormalized-angular",
        "parameters": {"efConstruction": 512, "m": 16},
    },
}

mq = marqo.Client(url="http://localhost:8882")

mq.create_index("my-test-structured-index", settings_dict=settings)

If we add the following documents to the index:

mq.index("my-test-structured-index").add_documents(
    [
        {
            "Title": "The Travels of Marco Polo",
            "Description": "A 13th-century travelogue describing Polo's travels",
            "image_location": "",
        },
        {
            "Title": "Extravehicular Mobility Unit (EMU)",
            "Description": "The EMU is a spacesuit that provides environmental protection",
            "image_location": "",
            "_id": "article_591",
        },
    ]
)

then a search invoking OWL-ViT as a reranker is performed by passing the model name to reranker. searchable_attributes should be specified explicitly, with the image field to rerank over appearing first in the list. For example,

response = mq.index("my-test-structured-index").search(
    "space suit",
    searchable_attributes=["image_location", "Description"],
    reranker="owl/ViT-B/32",  # example OWL-ViT model name
)

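The highlight can then be read from each hit in the response. A sketch over an illustrative response shape (the values, and the exact structure of real responses, are placeholders here):

```python
# Placeholder standing in for a real response from mq.index(...).search(...).
response = {
    "hits": [
        {
            "_id": "article_591",
            "_highlights": {"image_location": [10.0, 20.0, 110.0, 220.0]},
        },
    ],
}

for hit in response["hits"]:
    # (x1, y1, x2, y2) on the scale of the original image, top-left origin.
    x1, y1, x2, y2 = hit["_highlights"]["image_location"]
    print(hit["_id"], "box size:", (x2 - x1, y2 - y1))
```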

The current limitations of OWL-ViT are that it requires an image field to rerank over and that only a text query can be used. Support for image-based queries will be added shortly.

Available models


Coming soon!