Search

POST /indexes/{index_name}/search
Search for documents matching a specific query in the given index.

Path parameters

| Name | Type | Description |
|---|---|---|
| index_name | String | Name of the requested index |

Body

| Search Parameter | Type | Default value | Description |
|---|---|---|---|
| q | String OR Dict | "" | Query string, or weighted query strings (if Dict) |
| limit | Integer | 20 | Maximum number of document chunks to be returned |
| offset | Integer | 0 | Number of documents to skip (used for pagination) |
| filter | String | null | Filter string in the Marqo DSL |
| searchableAttributes | Array of strings | ["*"] | Attributes to search for query matches |
| showHighlights | Boolean | true | Return highlights for the document match |
| searchMethod | String | "TENSOR" | The search method, either LEXICAL or TENSOR |
| attributesToRetrieve | Array of strings | ["*"] | Attributes to return in the search response |
| reRanker | String | null | Method to use for reranking results |
| boost | Dict | null | Dictionary of attribute (string): 2-Array [weight (float), bias (float)] |
| image_download_headers | Dict | {} | Headers for the image download. Can be used to authenticate the images for download. |
| context | Dict | null | Dictionary of "tensor":{List[{"vector": List[floats], "weight": (float)}]} to bring your own vectors into search. |
| scoreModifiers | Dict | null | A dictionary to modify the score based on field values. See the Score modifiers section for examples. |
| modelAuth | Dict | null | Authorisation details used by Marqo to download non-publicly available models. See the Model Auth section for examples. |

Query parameters

| Search Parameter | Type | Default value | Description |
|---|---|---|---|
| device | String | cpu | The device used to search. This allows you to use CUDA GPUs to speed up search, if available. Options include cpu, cuda, cuda1, cuda2, etc. The cuda option tells Marqo to use all available CUDA devices. |
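
Because device is a query parameter rather than a body field, it goes in the URL instead of the JSON payload. The following is a minimal sketch using only the Python standard library; the helper name build_search_url is hypothetical and the localhost address is simply the one used in the example request further down this page.

```python
from urllib.parse import quote, urlencode


def build_search_url(base_url, index_name, device=None):
    """Build the search endpoint URL, appending ?device=... only when requested.

    Hypothetical helper for illustration; it is not part of the Marqo client.
    """
    url = f"{base_url.rstrip('/')}/indexes/{quote(index_name, safe='')}/search"
    if device is not None:
        url += "?" + urlencode({"device": device})
    return url


# The request body (q, limit, ...) is still sent as JSON in the POST payload.
print(build_search_url("http://localhost:8882", "my-first-index", device="cuda"))
```

Leaving device unset produces a URL with no query string, in which case the default listed in the table applies.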

Search Result Pagination

Use parameters limit and offset to paginate your results, meaning to query a certain number of results at a time instead of all at once.

The limit parameter sets the size of a page. If you set limit to 10, Marqo's response will contain a maximum of 10 search results. The offset parameter skips a number of search results. If you set offset to 20, Marqo's response will skip the first 20 search results.

Let's say you want each page to have 10 results, and you want to receive the 2nd page. Try setting limit and offset like so:

# Specify page properties
page_size = 10
page_num = 2

# Set limit and offset accordingly
limit = page_size
offset = (page_num - 1) * page_size

Pagination limitations

Tensor search inconsistencies

Using pagination with search_method="TENSOR" may result in some results being skipped or duplicated (often near the edge of pages) within the first few pages if the page size is much smaller than the total search result count. Please keep this in mind when looking for particular results or when result order is essential.

Single-field pagination only

Currently, pagination is only supported for searches in 1 field (when searchable_attributes is set to a single field). Setting offset for multi-field searches will result in an error. Additionally, search results can only be 10,000 results deep. This means limit + offset must be less than or equal to 10000.
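
The depth constraint can be checked on the client side before a request is sent. This is a small sketch, assuming 1-based page numbers as in the snippet above; the function name page_to_limit_offset and the constant MAX_RESULT_DEPTH are illustrative and not part of Marqo.

```python
MAX_RESULT_DEPTH = 10_000  # limit + offset must not exceed this value


def page_to_limit_offset(page_num, page_size):
    """Convert a 1-based page number and page size into (limit, offset)."""
    if page_num < 1 or page_size < 1:
        raise ValueError("page_num and page_size must be positive integers")
    limit = page_size
    offset = (page_num - 1) * page_size
    if limit + offset > MAX_RESULT_DEPTH:
        raise ValueError(
            f"limit + offset = {limit + offset} exceeds {MAX_RESULT_DEPTH}"
        )
    return limit, offset


limit, offset = page_to_limit_offset(page_num=2, page_size=10)
print(limit, offset)
```

With a page size of 10, page 1000 is the deepest page this check accepts, since it ends exactly at result 10,000.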

Lexical search: exact matches

Use searchMethod="LEXICAL" to perform keyword search instead of tensor search. With lexical search, you can enable exact match searching using double quotes: "".

Any term enclosed in "" will be labeled a required term, which must exist in at least one field of every result hit. Note that terms enclosed in double quotes must also have a space between them and the terms before and after them, same as regular terms. Use this feature to filter your results to only documents containing certain terms. For example, if you want to search for results containing fruits, vegetables, or candy, but they must be green, you can construct your query as such:

mq.index("my-first-index").search(
    q='fruit vegetable candy "green"',
    search_method="LEXICAL"
)

To escape double quotes so that they are interpreted as text, prefix each one with two backslashes (\\). For example: q = 'Dwayne \\"The Rock\\" Johnson'.

Note: syntax errors

If your use of "" does not follow proper syntax, the entire query will simply be interpreted literally, with no required terms. Here are some examples of syntax errors:

# Quoted terms without spaces before/after
q = 'apples"oranges" bananas'
q = 'cucumbers "melons and watermelons""grapefruit"'

# Unescaped quotes
q = 'There is a quote right"here'

# Unbalanced quotes
q = '"Dr. Seuss" "Thing 1" "Thing 2'
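
One way to avoid these syntax errors is to assemble the query string programmatically. The sketch below is an illustration only: escape_quotes and build_lexical_query are hypothetical helpers, not Marqo functions, and the sketch deliberately rejects quotes inside required terms rather than guess how Marqo treats them.

```python
def escape_quotes(text):
    """Escape double quotes so they are read as text, not as required-term markers."""
    return text.replace('"', '\\"')


def build_lexical_query(optional_terms, required_terms=()):
    """Join terms with single spaces and wrap each required term in double quotes."""
    parts = [escape_quotes(term) for term in optional_terms]
    for term in required_terms:
        if '"' in term:
            raise ValueError("this sketch does not handle quotes inside required terms")
        parts.append(f'"{term}"')
    return " ".join(parts)


print(build_lexical_query(["fruit", "vegetable", "candy"], required_terms=["green"]))
print(escape_quotes('Dwayne "The Rock" Johnson'))
```

Joining with single spaces guarantees that every quoted term has a space before and after it, and the quotes are always balanced because they are added in pairs.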

Response

| Name | Type | Description |
|---|---|---|
| hits | Array of objects | Results of the query |
| limit | Integer | Number of document chunks specified in the query |
| offset | Integer | Number of skipped results specified in the query |
| processingTimeMs | Number | Processing time of the query, in milliseconds |
| query | String | Query originating the response |

Example

curl -XPOST 'http://localhost:8882/indexes/my-first-index/search' -H 'Content-type:application/json' -d '
{
    "q": "What is the best outfit to wear on the moon?",
    "searchableAttributes": ["Description"],
    "limit": 10,
    "offset": 0,
    "showHighlights": true,
    "filter": "*:*",
    "searchMethod": "TENSOR",
    "attributesToRetrieve": ["Title", "Description"]
}'
# Python client equivalent of the cURL request above
import marqo

mq = marqo.Client(url="http://localhost:8882")

mq.index("my-first-index").search(
    q="What is the best outfit to wear on the moon?",
    searchable_attributes=["Description"],
    limit=10,
    offset=0,
    show_highlights=True,
    filter_string="*:*",
    search_method=marqo.SearchMethods.TENSOR,
    attributes_to_retrieve=["Title", "Description"]
)

Response: 200 OK

{
  "hits": [
    {
      "Title": "Extravehicular Mobility Unit (EMU)",
      "Description": "The EMU is a spacesuit that provides environmental protection, mobility, life support, and communications for astronauts",
      "_highlights": {
        "Description": "The EMU is a spacesuit that provides environmental protection, mobility, life support, and communications for astronauts"
      },
      "_id": "article_591",
      "_score": 1.2387788
    },
    {
      "Title": "The Travels of Marco Polo",
      "Description": "A 13th-century travelogue describing Polo's travels",
      "_highlights": {"Title": "The Travels of Marco Polo"},
      "_id": "e00d1a8d-894c-41a1-8e3b-d8b2a8fce12a",
      "_score": 1.2047464
    }
  ],
  "limit": 10,
  "offset": 0,
  "processingTimeMs": 49,
  "query": "What is the best outfit to wear on the moon?"
}
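
Each hit carries the retrieved document fields alongside the Marqo-added keys _id, _score and _highlights. As a minimal sketch of consuming such a response, the snippet below works on a trimmed copy of the example above as a plain Python dict; summarize_hits is a hypothetical helper name.

```python
def summarize_hits(response):
    """Return (document id, score) pairs in the order the hits were returned."""
    return [(hit["_id"], hit["_score"]) for hit in response["hits"]]


# Trimmed copy of the example response shown above.
response = {
    "hits": [
        {"Title": "Extravehicular Mobility Unit (EMU)", "_id": "article_591", "_score": 1.2387788},
        {"Title": "The Travels of Marco Polo", "_id": "e00d1a8d-894c-41a1-8e3b-d8b2a8fce12a", "_score": 1.2047464},
    ],
    "limit": 10,
    "offset": 0,
    "processingTimeMs": 49,
    "query": "What is the best outfit to wear on the moon?",
}

for doc_id, score in summarize_hits(response):
    print(f"{doc_id}: {score:.4f}")
```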

Query (q)

Parameter: q

Expected value: Search string, or a dictionary of weighted search strings (with the structure "search string": weight). Search strings are text or, if the index has treat_urls_and_pointers_as_images set to True, a URL to an image.

If queries are weighted, each weight acts as a (possibly negative) multiplier for that query, relative to the other queries.

Default value: null

Examples

# query string: 
q = "How do I keep my plant alive?"

# a dictionary of weighted query strings
q = {
    # a weighting of 1 gives this query a neutral effect:
    "Which dogs are the best pets": 1.0,
    # we give this a weighting of 2 because we really want results similar to this:
    "https://image_of_a_golden_retriever.png": 2.0,
    # we give this a negative weighting to make it less likely to appear: 
    "Poodle": -1
}

Limit

Parameter: limit

Expected value: Any positive integer

Default value: 20

Sets the maximum number of documents returned by a single query.

Offset

Parameter: offset

Expected value: Any integer greater than or equal to 0

Default value: 0

Sets the number of documents to skip. For example, if offset = 20, the first result returned will be the 21st result. Only set this parameter for single-field searches; multi-field searches do not support it yet.

Filter

Parameter: filter

Expected value: A filter string written in Marqo's query DSL.

Default value: null

Uses filter expressions to refine search results.

Read our guide on filtering, faceted search and filter expressions.

Example

You can write a filter expression in string syntax using logical connectives (see filtering in Marqo):

"(type:confectionary AND food:(ice cream)) OR animal:hippo"

Searchable attributes

Parameter: searchableAttributes

Expected value: An array of strings

Default value: ["*"]

Configures which attributes will be searched for query matches.

If no value is specified, searchableAttributes will be set to the wildcard and search all fields.

Example

You can write the searchableAttributes as a list of strings, for example if you only wanted to search the "Description" field of your documents:

["Description"]

Reranker

Parameter: reRanker

Expected value: One of "owl/ViT-B/32", "owl/ViT-B/16", "owl/ViT-L/14"

Default value: null

Selects the method for reranking results. See the Models reference reranking section for more details.

If no value is specified, reRanker will be set to null and no reranking will occur.

Example

You can write reRanker as a string, for example:

"owl/ViT-B/32"

Boost

Parameter: boost

Expected value: Dictionary of attribute (string): 2-Array [weight (float), bias (float)]

Default value: null

Boosting can increase or decrease the relevancy of a field during search. Within Marqo, the scores from that field are multiplied by the weight and then summed with the bias. Boosting is only available for tensor search.

Example

my_index.search(
    "Chocolate chip cookies",
    boost={
        # we want to decrease the relevancy of the "Title" field
        "Title": [-1, -0.5],
        # we want to increase the relevancy of the "Image" field
        "Image": [2.5, 0]
    }
)
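
To make the arithmetic concrete, the following sketch applies the weight and bias to a single field score exactly as the description states (score times weight, plus bias). The function name apply_boost and the starting score of 0.5 are made up for illustration; how Marqo then combines scores across fields is not covered here.

```python
def apply_boost(field_score, weight, bias):
    """Multiply a field's score by the weight, then add the bias."""
    return field_score * weight + bias


# The [weight, bias] pairs from the example above, applied to a score of 0.5.
boost = {"Title": [-1, -0.5], "Image": [2.5, 0]}
for field, (weight, bias) in boost.items():
    print(field, apply_boost(0.5, weight, bias))
```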

Context

Parameter: context

Expected value: Dictionary of "tensor":{List[{"vector": List[floats], "weight": (float)}]}

Default value: null

Context allows you to use your own vectors as context for your queries. Your vectors will be incorporated into the query using a weighted sum approach, allowing you to reduce the number of inference requests for duplicated content. The dimension of the provided vectors should be consistent with the index dimension.

Example

my_index.search(
    {"Chocolate chip cookies" :1},
    # the dimension of the vector (which is 768 here) should match the dimension of the index
    context = {"tensor": [{"vector": [0.3,] * 768, "weight" : 2}, # custom vector 1
                          {"vector": [0.12,] * 768, "weight" : -1},] # custom vector 2
    }
)
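
The weighted sum itself can be illustrated in a few lines of plain Python. This is a sketch under the assumption of a straightforward element-wise weighted sum; Marqo may apply additional steps (such as normalisation) that are not described here, and weighted_sum is a hypothetical helper.

```python
def weighted_sum(weighted_vectors):
    """Combine (vector, weight) pairs into one vector by element-wise weighted sum."""
    dim = len(weighted_vectors[0][0])
    combined = [0.0] * dim
    for vector, weight in weighted_vectors:
        if len(vector) != dim:
            raise ValueError("all vectors must share the index dimension")
        for i, value in enumerate(vector):
            combined[i] += weight * value
    return combined


# Toy 2-dimensional stand-ins: one query embedding plus two context vectors.
print(weighted_sum([
    ([1.0, 0.0], 1),    # embedding of the text query, weight 1
    ([0.5, 0.5], 2),    # custom vector 1, weight 2
    ([0.25, 1.0], -1),  # custom vector 2, weight -1
]))
```

The dimension check mirrors the requirement that provided vectors match the index dimension.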

Score modifiers

Parameter: scoreModifiers (score_modifiers in the Python client)

Expected value: Check the following examples.

# Score modifiers with 2 multiply fields and 2 add fields, each with a weight.
# You can add more fields if needed.
{
    "multiply_score_by": [
        {"field_name": "my_multiply_field_1", "weight": 1},
        {"field_name": "my_multiply_field_2", "weight": 2}
    ],
    "add_to_score": [
        {"field_name": "my_add_field_1", "weight": 3},
        {"field_name": "my_add_field_2", "weight": 4}
    ]
}

# You can skip the "weight" and it will be set to 1 by default.
{
    "multiply_score_by": [
        {"field_name": "my_multiply_field_1"},
        {"field_name": "my_multiply_field_2"}
    ],
    "add_to_score": [
        {"field_name": "my_add_field_1", "weight": 3},
        {"field_name": "my_add_field_2", "weight": 4}
    ]
}

# You can provide only "multiply_score_by" fields or only "add_to_score" fields.
{
    "multiply_score_by": [
        {"field_name": "my_multiply_field_1", "weight": 1},
        {"field_name": "my_multiply_field_2", "weight": 2}
    ]
}

{
    "add_to_score": [
        {"field_name": "my_add_field_1", "weight": 3},
        {"field_name": "my_add_field_2", "weight": 4}
    ]
}

Default value: null

Score modifiers adjust the search score of a document based on the values of its fields. You specify which fields to use and the weight each one carries.

The process involves two steps. First, the original score is multiplied by the field values and corresponding weights specified in the multiply_score_by parameter. If the specified fields are present in the document body, their values will be used to adjust the original score.

Second, the values of the fields listed in add_to_score are multiplied by their corresponding weights and added to the score obtained from the previous step. Marqo then takes the maximum of the resulting score and 0 to avoid negative scores.

Overall, score modifiers can be a powerful tool to fine-tune the search results and provide more accurate and relevant information to the user.

Example

my_index.search(
    q = "Chocolate chip cookies",
    score_modifiers = {
        "multiply_score_by":
            [{"field_name": "my_multiply_field_1", "weight": 1},
             {"field_name": "my_multiply_field_2", "weight": 2}],
        "add_to_score":
            [{"field_name": "my_add_field_1", "weight": -1},
             {"field_name": "my_add_field_2", "weight": 1}]
    }
)
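
The two-step calculation described above can be sketched in plain Python. This is an illustration of the arithmetic as this page describes it, not Marqo's implementation: modify_score is a hypothetical function, and skipping fields that are missing from the document is an assumption based on the statement that fields are used if present.

```python
def modify_score(score, document, score_modifiers):
    """Apply multiply_score_by, then add_to_score, then clamp the result at 0."""
    # Step 1: multiply the score by weight * field value for each listed field.
    for modifier in score_modifiers.get("multiply_score_by", []):
        value = document.get(modifier["field_name"])
        if value is not None:  # assumption: absent fields leave the score unchanged
            score *= modifier.get("weight", 1) * value
    # Step 2: add weight * field value for each listed field.
    for modifier in score_modifiers.get("add_to_score", []):
        value = document.get(modifier["field_name"])
        if value is not None:
            score += modifier.get("weight", 1) * value
    # Negative scores are replaced by 0.
    return max(score, 0)


document = {"my_multiply_field_1": 2, "my_add_field_1": 4}
modifiers = {
    "multiply_score_by": [{"field_name": "my_multiply_field_1"}],  # weight defaults to 1
    "add_to_score": [{"field_name": "my_add_field_1", "weight": 0.5}],
}
print(modify_score(0.5, document, modifiers))
```

Here the original score 0.5 is first multiplied by 1 * 2, then 0.5 * 4 is added. A sufficiently negative add_to_score weight would push the result below zero, at which point the final score is clamped to 0.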

Model Auth

Parameter: modelAuth

Expected value: Dictionary with either an s3 or an hf model store authorisation object.

Default value: null

The ModelAuth object allows searching on indexes that use OpenCLIP and CLIP models from private Hugging Face and AWS S3 stores.

The modelAuth object contains either an s3 or an hf model store authorisation object, which holds the credentials needed to access the index's non-publicly accessible model. See the example for details.

The index's settings must specify the non-publicly accessible model's location in the model_properties object.

ModelAuth is used to initially download the model. After downloading, Marqo caches the model so that it doesn't need to be redownloaded.

Example: AWS s3

# Create an index that specifies the non-public location of the model.
# Note the `auth_required` field in `model_properties` which tells Marqo to use
# the modelAuth it finds during search to download the model
mq.create_index(
    index_name="my-cool-index",
    settings_dict={
        "index_defaults": {
            "treat_urls_and_pointers_as_images": True,
            "model": 'my_s3_model',
            "normalize_embeddings": True,
            # model_properties is a plain dictionary of settings
            "model_properties": {
                "name": "ViT-B/32",
                "dimensions": 512,
                "model_location": {
                    "s3": {
                        "Bucket": "<SOME BUCKET>",
                        "Key": "<KEY TO IDENTIFY MODEL>",
                    },
                    "auth_required": True
                },
                "type": "open_clip",
            }
        }
    }
)

# Specify the authorisation needed to access the private model during search.
# We recommend setting up the credential's AWS user so that it has only the
# minimal access needed to retrieve the model.
# Note: the Python client takes this argument as model_auth (modelAuth is the REST body key).
mq.index("my-cool-index").search(
    q = "Chocolate chip cookies",
    model_auth={
        's3': {
            "aws_access_key_id" : "<SOME ACCESS KEY ID>",
            "aws_secret_access_key": "<SOME SECRET ACCESS KEY>"
        }
    }
)

Example: Hugging Face (HF)

# Create an index that specifies the non-public location of the model.
# Note the `auth_required` field in `model_properties` which tells Marqo to use
# the modelAuth it finds during search to download the model
mq.create_index(
    index_name="my-cool-index",
    settings_dict={
        "index_defaults": {
            "treat_urls_and_pointers_as_images": True,
            "model": 'my_hf_model',
            "normalize_embeddings": True,
            # model_properties is a plain dictionary of settings
            "model_properties": {
                "name": "ViT-B/32",
                "dimensions": 512,
                "model_location": {
                    "hf": {
                        "repo_id": "<SOME HF REPO NAME>",
                        "filename": "<THE FILENAME TO DOWNLOAD>",
                    },
                    "auth_required": True
                },
                "type": "open_clip",
            }
        }
    }
)

# Specify the authorisation needed to access the private model during search.
# Note: the Python client takes this argument as model_auth (modelAuth is the REST body key).
mq.index("my-cool-index").search(
    q = "Chocolate chip cookies",
    model_auth={
        'hf': {
            "token" : "<SOME HF TOKEN>",
        }
    }
)