
Indexes

This page details how to create, delete, and retrieve indexes on Marqo Cloud. Find your Marqo Cloud API key by heading to the Marqo console and opening the API Keys tab.

Note: The existing create, delete, and modify index endpoints remain unchanged for Marqo 1.0 indexes; see the documentation for 1.0 indexes here. Marqo has introduced new versions of these endpoints for Marqo 2.0 indexes. They follow the pattern /api/v2/indexes/ and are described below.


Create index

POST https://api.marqo.ai/api/v2/indexes/{index_name}

Create an index with (optional) settings. This endpoint accepts the application/json content type.

Marqo Cloud creates dedicated infrastructure for each index. Through the create index endpoint you can specify the index's storage class (storageClass) and inference type (inferenceType). The number of storage shards is set by numberOfShards, the number of replicas by numberOfReplicas, and the number of Marqo inference nodes by numberOfInferences.

Example

curl -XPOST 'https://api.marqo.ai/api/v2/indexes/my-first-index' \
-H 'x-api-key: XXXXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "treatUrlsAndPointersAsImages": false,
  "model": "hf/e5-base-v2",
  "numberOfShards": 1,
  "numberOfReplicas": 0,
  "inferenceType": "marqo.CPU.small",
  "storageClass": "marqo.basic",
  "numberOfInferences": 1
}'
import marqo
mq = marqo.Client("https://api.marqo.ai", api_key="XXXXXXXXXXXXXXX")
index_settings = {
    "treatUrlsAndPointersAsImages": False,
    "model": "hf/e5-base-v2",
    "numberOfShards": 1,
    "numberOfReplicas": 0,
    "inferenceType": "marqo.CPU.small",
    "storageClass": "marqo.basic",
    "numberOfInferences": 1
}
mq.create_index("my-first-index", settings_dict=index_settings)

Response: 200 OK

{"acknowledged":true, "shards_acknowledged":true, "index":"my-first-index"}

Path parameters

| Name | Type | Description |
|------|------|-------------|
| index_name | String | Name of the index |

Body Parameters

The settings for the index. The settings are represented as a nested JSON object.

| Name | Type | Default value | Description |
|------|------|---------------|-------------|
| treatUrlsAndPointersAsImages | Boolean | false | Fetch images from pointers |
| model | String | hf/e5-base-v2 | The model used to vectorise document content in add_documents() calls for the index |
| modelProperties | Dictionary | "" | The model properties object corresponding to model (for custom models) |
| normalizeEmbeddings | Boolean | true | Normalize the embeddings to have unit length |
| textPreprocessing | Dictionary | "" | The text preprocessing object |
| imagePreprocessing | Dictionary | "" | The image preprocessing object |
| annParameters | Dictionary | "" | The ANN algorithm parameter object |
| type | String | unstructured | Type of the index |
| vectorNumericType | String | float | Numeric type for vector encoding |
| filterStringMaxLength | Integer | 20 | The maximum character length allowed for strings used in filtering queries on unstructured indexes; any string field you intend to filter on should not exceed this length |
| inferenceType | String | marqo.CPU.small | Type of inference for the index. Options: "marqo.CPU.small", "marqo.CPU.large", "marqo.GPU" |
| storageClass | String | marqo.basic | Type of storage for the index. Options: "marqo.basic", "marqo.balanced", "marqo.performance" |
| numberOfShards | Integer | 1 | The number of shards for the index |
| numberOfReplicas | Integer | 0 | The number of replicas for the index |
| numberOfInferences | Integer | 1 | The number of inference nodes for the index |
| textChunkPrefix | String | "" ("passage: " for e5 models) | The prefix added to indexed text document chunks when embedding |
| textQueryPrefix | String | "" ("query: " for e5 models) | The prefix added to text queries when embedding |
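As a sketch of how these settings fit together, the fragment below builds a settings dictionary and checks the enumerated fields against the options listed above before it would be passed to create_index. The validate_settings helper is illustrative, not part of the Marqo client.

```python
# Illustrative helper (not part of the Marqo client): sanity-check the
# enumerated fields of an index settings dict before creating the index.
ALLOWED_INFERENCE = {"marqo.CPU.small", "marqo.CPU.large", "marqo.GPU"}
ALLOWED_STORAGE = {"marqo.basic", "marqo.balanced", "marqo.performance"}


def validate_settings(settings: dict) -> dict:
    """Raise ValueError if an enumerated field has an unknown value."""
    inference = settings.get("inferenceType", "marqo.CPU.small")
    if inference not in ALLOWED_INFERENCE:
        raise ValueError(f"unknown inferenceType: {inference!r}")
    storage = settings.get("storageClass", "marqo.basic")
    if storage not in ALLOWED_STORAGE:
        raise ValueError(f"unknown storageClass: {storage!r}")
    return settings


settings = validate_settings({
    "model": "hf/e5-base-v2",
    "inferenceType": "marqo.CPU.small",
    "storageClass": "marqo.basic",
    "numberOfShards": 1,
})
# `settings` can now be passed as settings_dict to mq.create_index(...)
```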

Text Preprocessing Object

The textPreprocessing object contains the specifics of how you want the index to preprocess text. The parameters are as follows:

| Name | Type | Default value | Description |
|------|------|---------------|-------------|
| splitLength | Integer | 2 | The length of the chunks after splitting by splitMethod |
| splitOverlap | Integer | 0 | The length of overlap between adjacent chunks |
| splitMethod | String | sentence | The method by which text is chunked (character, word, sentence, or passage) |
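To make splitLength and splitOverlap concrete, here is a rough sketch of the chunking scheme for splitMethod "sentence": consecutive windows of splitLength sentences, each sharing splitOverlap sentences with the previous window. This is an approximation for intuition, not Marqo's actual implementation.

```python
def chunk_sentences(sentences, split_length=2, split_overlap=0):
    """Group sentences into overlapping chunks, approximating the
    textPreprocessing behaviour described above (illustrative only)."""
    step = split_length - split_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(sentences), step):
        window = sentences[start:start + split_length]
        if window:
            chunks.append(" ".join(window))
        if start + split_length >= len(sentences):
            break
    return chunks


sents = ["One.", "Two.", "Three.", "Four."]
chunk_sentences(sents, split_length=2, split_overlap=0)
# → ["One. Two.", "Three. Four."]
chunk_sentences(sents, split_length=2, split_overlap=1)
# → ["One. Two.", "Two. Three.", "Three. Four."]
```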

Image Preprocessing Object

The imagePreprocessing object contains the specifics of how you want the index to preprocess images. The parameters are as follows:

| Name | Type | Default value | Description |
|------|------|---------------|-------------|
| patchMethod | String | null | The method by which images are chunked (simple or frcnn) |

ANN Algorithm Parameter object

The annParameters object contains hyperparameters for the approximate nearest neighbour algorithm used for tensor storage within Marqo. The parameters are as follows:

| Name | Type | Default value | Description |
|------|------|---------------|-------------|
| spaceType | String | prenormalized-angular | The function used to measure the distance between two points in ANN (l1, l2, linf, or prenormalized-angular) |
| parameters | Dict | "" | The hyperparameters for the Marqo index's HNSW graphs |

HNSW Method Parameters Object

parameters can have the following values:

| Name | Type | Default value | Description |
|------|------|---------------|-------------|
| efConstruction | Integer | 512 | The size of the dynamic list used during k-NN graph creation. Higher values lead to a more accurate graph but slower indexing speed. It is recommended to keep this between 2 and 800 (maximum is 4096) |
| m | Integer | 16 | The number of bidirectional links that the plugin creates for each new element. Increasing and decreasing this value can have a large impact on memory consumption. Keep this value between 2 and 100 |

Model Properties Object

modelProperties is a flexible object that is used to set up models that aren't available in Marqo by default (models available by default are listed here). The structure of modelProperties will vary depending on the model.

For Open CLIP models, see here for modelProperties format and example usage.

For Generic SBERT models, see here for modelProperties format and example usage.

Below is a sample index settings JSON object. When using the Python client, pass this dictionary as the settings_dict parameter for the create_index method.

{
  "type": "unstructured",
  "vectorNumericType": "float",
  "treatUrlsAndPointersAsImages": true,
  "model": "open_clip/ViT-L-14/laion2b_s32b_b82k",
  "normalizeEmbeddings": true,
  "textPreprocessing": {
    "splitLength": 2,
    "splitOverlap": 0,
    "splitMethod": "sentence"
  },
  "imagePreprocessing": {
    "patchMethod": null
  },
  "annParameters": {
    "spaceType": "prenormalized-angular",
    "parameters": {
      "efConstruction": 512,
      "m": 16
    }
  },
  "filterStringMaxLength": 20
}

Delete index

Delete an index.

Note: This operation cannot be undone, and the deleted index cannot be recovered.

DELETE https://api.marqo.ai/api/v2/indexes/{index_name}

Example

curl -H 'x-api-key: XXXXXXXXXXXXXXX' -XDELETE https://api.marqo.ai/api/v2/indexes/my-first-index
results = mq.index("my-first-index").delete()

Response: 200 OK

{"acknowledged": true}

List indexes

GET https://api.marqo.ai/api/v2/indexes

List all indexes.

Example

curl -H 'X-API-KEY: XXXXXXXXXXXXXXX' https://api.marqo.ai/api/v2/indexes
mq.get_indexes()

Response: 200 OK

{
  "results": [
    {
      "Created": "2024-01-02T23:03:37.205347",
      "indexName": "imageindex",
      "numberOfShards": "1",
      "numberOfReplicas": "0",
      "indexStatus": "READY",
      "numberOfInferences": "1",
      "storageClass": "BASIC",
      "inferenceType": "CPU.SMALL",
      "docs.count": "0",
      "store.size": "0",
      "docs.deleted": "0",
      "search.queryTotal": "0",
      "treatUrlsAndPointersAsImages": true,
      "marqoEndpoint": "https://imageindex-c8ua99-w8nt2f73.marqo-staging.com",
      "type": "unstructured",
      "vectorNumericType": "float",
      "model": "open_clip/ViT-L-14/laion2b_s32b_b82k",
      "normalizeEmbeddings": true,
      "textPreprocessing": {
        "split_length": "2",
        "split_method": "sentence",
        "split_overlap": "0"
      },
      "imagePreprocessing": {},
      "annParameters": {
        "spaceType": "prenormalized-angular",
        "parameters": {
          "ef_construction": "128",
          "m": "16"
        }
      },
      "marqoVersion": "2.0.2-beta",
      "filterStringMaxLength": "20"
    }
  ]
}
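When working with the list response programmatically, a common task is filtering for indexes that are ready to serve traffic. The sketch below assumes a response shaped like the example above (a results list of index objects, each with indexName and indexStatus fields); in practice the input would come from mq.get_indexes().

```python
def ready_indexes(list_response: dict) -> list:
    """Return the names of indexes whose status is READY.

    Assumes the response shape documented above; `list_response`
    would normally be the return value of mq.get_indexes().
    """
    return [
        ix["indexName"]
        for ix in list_response.get("results", [])
        if ix.get("indexStatus") == "READY"
    ]


# Sample response fragment mirroring the documented shape:
sample = {
    "results": [
        {"indexName": "imageindex", "indexStatus": "READY"},
        {"indexName": "new-index", "indexStatus": "CREATING"},
    ]
}
ready_indexes(sample)  # → ["imageindex"]
```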