
Indexes

This page details how to create, modify, delete, and list indexes on Marqo Cloud. You can find your Marqo Cloud API key in the API keys tab of the Marqo console.

Note: The create, delete, and modify index endpoints for Marqo 1.0 indexes are unchanged. Check out the documentation for 1.0 indexes here. Marqo 2.0 indexes use new endpoints that follow the pattern /api/v2/indexes/ and are described below.


Create index

POST https://api.marqo.ai/api/v2/indexes/{index_name}

Create an index with (optional) settings. This endpoint accepts the application/json content type.

Marqo Cloud creates dedicated infrastructure for each index. Using the create index endpoint, you can specify the storage type for the index with storageClass and the inference type with inferenceType. numberOfShards sets the number of storage instances, numberOfReplicas sets the number of replicas, and numberOfInferences sets the number of Marqo inference nodes.

Example

=== "cURL"

curl -XPOST 'https://api.marqo.ai/api/v2/indexes/my-first-index' \
-H 'x-api-key: XXXXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
    "treatUrlsAndPointersAsImages": false,
    "model": "hf/e5-base-v2",
    "numberOfShards": 1,
    "numberOfReplicas": 0,
    "inferenceType": "marqo.CPU.large",
    "storageClass": "marqo.basic",
    "numberOfInferences": 1
}'

=== "Python"

import marqo

# Connect to Marqo Cloud with your API key
mq = marqo.Client("https://api.marqo.ai", api_key="XXXXXXXXXXXXXXX")

# Index settings, including the Marqo Cloud infrastructure options
index_settings = {
    "treatUrlsAndPointersAsImages": False,
    "model": "hf/e5-base-v2",
    "numberOfShards": 1,
    "numberOfReplicas": 0,
    "inferenceType": "marqo.CPU.large",
    "storageClass": "marqo.basic",
    "numberOfInferences": 1,
}

# Create the index with the settings above
mq.create_index("my-first-index", settings_dict=index_settings)

Response: 200 OK

{"acknowledged":true, "shards_acknowledged":true, "index":"my-first-index"}
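If the Python client is not available, the same request can be assembled with the standard library alone. The following is a minimal sketch, not an official client: `build_create_index_request` is a hypothetical helper name, and the request is only constructed here, not sent. The URL pattern, the `x-api-key` header, and the body fields are taken from the example above.

```python
import json
import urllib.request

# Endpoint pattern documented on this page
API_BASE = "https://api.marqo.ai/api/v2/indexes"


def build_create_index_request(index_name, api_key, settings):
    """Build (but do not send) the POST request that creates an index.

    Hypothetical helper for illustration; not part of the marqo client.
    """
    body = json.dumps(settings).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{index_name}",
        data=body,
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )


request = build_create_index_request(
    "my-first-index",
    "XXXXXXXXXXXXXXX",
    {
        "model": "hf/e5-base-v2",
        "numberOfShards": 1,
        "numberOfReplicas": 0,
        "inferenceType": "marqo.CPU.large",
        "storageClass": "marqo.basic",
        "numberOfInferences": 1,
    },
)

# To actually send the request: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any infrastructure is created.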

Path parameters

| Name | Type | Description |
| --- | --- | --- |
| index_name | String | Name of the index |

Body Parameters

The settings for the index. The settings are represented as a nested JSON object.

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| treatUrlsAndPointersAsImages | Boolean | "" | Fetch images from pointers |
| model | String | hf/e5-base-v2 | The model to use to vectorise doc content in add_documents() calls for the index |
| modelProperties | Dictionary | "" | The model properties object corresponding to model (for custom models) |
| normalizeEmbeddings | Boolean | true | Normalize the embeddings to have unit length |
| textPreprocessing | Dictionary | "" | The text preprocessing object |
| imagePreprocessing | Dictionary | "" | The image preprocessing object |
| annParameters | Dictionary | "" | The ANN algorithm parameter object |
| type | String | unstructured | Type of the index |
| vectorNumericType | String | float | Numeric type for vector encoding |
| filterStringMaxLength | Int | 20 | The maximum character length allowed for strings used in filtering queries within unstructured indexes. With the default, any string field you intend to use as a filter in these indexes should not exceed 20 characters. |
| inferenceType | String | marqo.CPU.small | Type of inference for the index. Options are "marqo.CPU.small" (deprecated), "marqo.CPU.large", "marqo.GPU". |
| storageClass | String | marqo.basic | Type of storage for the index. Options are "marqo.basic", "marqo.balanced.storage", "marqo.balanced.throughput", "marqo.performance". |
| numberOfShards | Integer | 1 | The number of shards for the index. |
| numberOfReplicas | Integer | 0 | The number of replicas for the index. |
| numberOfInferences | Integer | 1 | The number of inference nodes for the index. |
| textChunkPrefix | String | "" or "passage: " for e5 models | The prefix added to indexed text document chunks when embedding. |
| textQueryPrefix | String | "" or "query: " for e5 models | The prefix added to text queries when embedding. |
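Because inferenceType and storageClass only accept the values listed in the table, it can help to check a settings dictionary locally before sending it. The sketch below is illustrative only: `check_cloud_settings` is a hypothetical helper, and the allowed values are copied from the table above rather than fetched from Marqo Cloud, so they may drift from what the service accepts.

```python
# Allowed values copied from the body parameters table on this page
INFERENCE_TYPES = {"marqo.CPU.small", "marqo.CPU.large", "marqo.GPU"}
STORAGE_CLASSES = {
    "marqo.basic",
    "marqo.balanced.storage",
    "marqo.balanced.throughput",
    "marqo.performance",
}


def check_cloud_settings(settings):
    """Return a list of messages about the cloud options in `settings`.

    Hypothetical helper; only keys that are present are checked.
    """
    messages = []
    if "inferenceType" in settings:
        inference_type = settings["inferenceType"]
        if inference_type not in INFERENCE_TYPES:
            messages.append(f"unknown inferenceType {inference_type!r}")
        elif inference_type == "marqo.CPU.small":
            messages.append("inferenceType 'marqo.CPU.small' is deprecated")
    if "storageClass" in settings:
        storage_class = settings["storageClass"]
        if storage_class not in STORAGE_CLASSES:
            messages.append(f"unknown storageClass {storage_class!r}")
    return messages


messages = check_cloud_settings(
    {"inferenceType": "marqo.CPU.large", "storageClass": "marqo.basic"}
)
```

An empty list means neither option was flagged; the service itself remains the final authority on what is accepted.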

Text Preprocessing Object

The textPreprocessing object contains the specifics of how you want the index to preprocess text. The parameters are as follows:

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| splitLength | Integer | 2 | The length of the chunks after splitting by splitMethod |
| splitOverlap | Integer | 0 | The length of overlap between adjacent chunks |
| splitMethod | String | sentence | The method by which text is chunked (character, word, sentence, or passage) |
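To make the interaction between splitLength and splitOverlap concrete, here is a rough sliding-window sketch for word-level splitting. This is not Marqo's implementation, which is not shown on this page; `chunk_words` is a hypothetical function, and details such as how the final partial chunk is handled are assumptions.

```python
def chunk_words(text, split_length=2, split_overlap=0):
    """Split `text` into chunks of `split_length` words.

    Adjacent chunks share `split_overlap` words. Illustrative only;
    Marqo's own splitter may treat edge cases differently.
    """
    if split_length <= 0:
        raise ValueError("split_length must be positive")
    if not 0 <= split_overlap < split_length:
        raise ValueError("split_overlap must be in [0, split_length)")

    words = text.split()
    step = split_length - split_overlap  # how far the window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + split_length]))
        if start + split_length >= len(words):
            break  # the window has reached the end of the text
    return chunks


no_overlap = chunk_words("a b c d e", split_length=2, split_overlap=0)
with_overlap = chunk_words("a b c d e", split_length=3, split_overlap=1)
```

With no overlap the window advances by a full chunk each time; with an overlap of 1 the last word of one chunk is repeated as the first word of the next.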

Image Preprocessing Object

The imagePreprocessing object contains the specifics of how you want the index to preprocess images. The parameters are as follows:

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| patchMethod | String | null | The method by which images are chunked (simple or frcnn) |

ANN Algorithm Parameter object

The annParameters object contains hyperparameters for the approximate nearest neighbour algorithm used for tensor storage within Marqo. The parameters are as follows:

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| spaceType | String | prenormalized-angular | The function used to measure the distance between two points in ANN (l1, l2, linf, or prenormalized-angular). |
| parameters | Dict | "" | The hyperparameters for the Marqo index's HNSW graphs. |

HNSW Method Parameters Object

parameters can have the following values:

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| efConstruction | int | 512 | The size of the dynamic list used during k-NN graph creation. Higher values lead to a more accurate graph but slower indexing speed. It is recommended to keep this between 2 and 800 (maximum is 4096). |
| m | int | 16 | The number of bidirectional links that the plugin creates for each new element. Increasing and decreasing this value can have a large impact on memory consumption. Keep this value between 2 and 100. |
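The ranges given for spaceType, efConstruction, and m can be checked before an index is created. The following sketch encodes only the limits stated in the tables above; `check_ann_parameters` is a hypothetical helper, and the server may enforce these limits differently.

```python
# Distance functions listed for spaceType on this page
SPACE_TYPES = {"l1", "l2", "linf", "prenormalized-angular"}


def check_ann_parameters(ann_parameters):
    """Return a list of problems found in an annParameters dictionary.

    Missing keys fall back to the documented defaults. Hypothetical
    helper for illustration only.
    """
    problems = []

    space_type = ann_parameters.get("spaceType", "prenormalized-angular")
    if space_type not in SPACE_TYPES:
        problems.append(f"unknown spaceType {space_type!r}")

    hnsw = ann_parameters.get("parameters", {})
    ef_construction = hnsw.get("efConstruction", 512)
    m = hnsw.get("m", 16)

    if ef_construction > 4096:
        problems.append(
            f"efConstruction {ef_construction} exceeds the maximum of 4096"
        )
    elif not 2 <= ef_construction <= 800:
        problems.append(
            f"efConstruction {ef_construction} is outside the recommended range 2 to 800"
        )

    if not 2 <= m <= 100:
        problems.append(f"m {m} is outside the range 2 to 100")

    return problems


problems = check_ann_parameters(
    {"spaceType": "prenormalized-angular", "parameters": {"efConstruction": 512, "m": 16}}
)
```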

Model Properties Object

modelProperties is a flexible object that is used to set up models that aren't available in Marqo by default (models available by default are listed here). The structure of modelProperties will vary depending on the model.

For OpenCLIP models, see here for modelProperties format and example usage.

For Generic SBERT models, see here for modelProperties format and example usage.

Below is a sample index settings JSON object. When using the Python client, pass this dictionary as the settings_dict parameter for the create_index method.

{
    "type": "unstructured",
    "vectorNumericType": "float",
    "treatUrlsAndPointersAsImages": true,
    "model": "open_clip/ViT-L-14/laion2b_s32b_b82k",
    "normalizeEmbeddings": true,
    "textPreprocessing": {
        "splitLength": 2,
        "splitOverlap": 0,
        "splitMethod": "sentence"
    },
    "imagePreprocessing": {
        "patchMethod": null
    },
    "annParameters": {
        "spaceType": "prenormalized-angular",
        "parameters": {
            "efConstruction": 512,
            "m": 16
        }
    },
    "filterStringMaxLength": 20
}
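Note that in Python the JSON literals `true` and `null` are written `True` and `None`. The sketch below expresses the same settings as a Python dictionary and serializes it with the standard `json` module to confirm that it produces the JSON form; the variable names are illustrative.

```python
import json

# The sample settings above, written as a Python dictionary
index_settings = {
    "type": "unstructured",
    "vectorNumericType": "float",
    "treatUrlsAndPointersAsImages": True,   # JSON true
    "model": "open_clip/ViT-L-14/laion2b_s32b_b82k",
    "normalizeEmbeddings": True,
    "textPreprocessing": {
        "splitLength": 2,
        "splitOverlap": 0,
        "splitMethod": "sentence",
    },
    "imagePreprocessing": {
        "patchMethod": None,                # JSON null
    },
    "annParameters": {
        "spaceType": "prenormalized-angular",
        "parameters": {"efConstruction": 512, "m": 16},
    },
    "filterStringMaxLength": 20,
}

# Serialize to the JSON text that would be sent as the request body
payload = json.dumps(index_settings)
```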

Modify index

You can modify the settings of an existing index, such as the number of inference nodes or the type of inference node.

PUT https://api.marqo.ai/api/v2/indexes/{index_name}

Example

=== "cURL"

curl -XPUT 'https://api.marqo.ai/api/v2/indexes/my-first-index' \
-H 'x-api-key: XXXXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
    "numberOfInferences": 1,
    "inferenceType": "marqo.CPU.large"
}'

Response: 200 OK

{"acknowledged":true}

Path parameters

| Name | Type | Description |
| --- | --- | --- |
| index_name | String | Name of the index |

Body Parameters

The settings for the index. The settings are represented as a nested JSON object.

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| inferenceType | String | marqo.CPU.small | Type of inference for the index. Options are "marqo.CPU.small" (deprecated), "marqo.CPU.large", "marqo.GPU". |
| numberOfInferences | Integer | 1 | Defines the number of inference nodes for the index. The minimum value is 0, and the maximum value is 5 by default, but this is dependent on your account limits. |
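Only a cURL example is given for this endpoint, so here is a standard-library Python sketch of the same PUT request. It is an illustration rather than an official client call: `build_modify_index_request` and its `max_inferences` parameter are hypothetical, the default limit of 5 is taken from the table above and may differ for your account, and the request is constructed but not sent.

```python
import json
import urllib.request

# Endpoint pattern documented on this page
API_BASE = "https://api.marqo.ai/api/v2/indexes"


def build_modify_index_request(
    index_name, api_key, number_of_inferences, inference_type, max_inferences=5
):
    """Build (but do not send) the PUT request that modifies an index.

    Hypothetical helper. `max_inferences` defaults to the documented
    default limit and should be adjusted to your account limits.
    """
    if not 0 <= number_of_inferences <= max_inferences:
        raise ValueError(
            f"numberOfInferences must be between 0 and {max_inferences}"
        )
    body = json.dumps(
        {
            "numberOfInferences": number_of_inferences,
            "inferenceType": inference_type,
        }
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{index_name}",
        data=body,
        method="PUT",
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )


request = build_modify_index_request(
    "my-first-index", "XXXXXXXXXXXXXXX", 1, "marqo.CPU.large"
)

# To actually send the request: urllib.request.urlopen(request)
```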

Delete index

Delete an index.

Note: This operation cannot be undone, and the deleted index can't be recovered.

DELETE https://api.marqo.ai/api/v2/indexes/{index_name}

Example

=== "cURL"

curl -H 'x-api-key: XXXXXXXXXXXXXXX' -XDELETE https://api.marqo.ai/api/v2/indexes/my-first-index

=== "Python"

results = mq.index("my-first-index").delete()

Response: 200 OK

{"acknowledged": true}

List indexes

GET https://api.marqo.ai/api/v2/indexes

List all of your indexes, along with their settings and status.

Example

=== "cURL"

curl -H 'x-api-key: XXXXXXXXXXXXXXX' https://api.marqo.ai/api/v2/indexes

=== "Python"

mq.get_indexes()

Response: 200 OK

{
    "results": [
        {
            "Created": "2024-01-02T23:03:37.205347",
            "indexName": "imageindex",
            "numberOfShards": "1",
            "numberOfReplicas": "0",
            "indexStatus": "READY",
            "numberOfInferences": "1",
            "storageClass": "BASIC",
            "inferenceType": "CPU",
            "docs.count": "0",
            "store.size": "0",
            "docs.deleted": "0",
            "search.queryTotal": "0",
            "treatUrlsAndPointersAsImages": true,
            "marqoEndpoint": "https://imageindex-c8ua99-w8nt2f73.marqo-staging.com",
            "type": "unstructured",
            "vectorNumericType": "float",
            "model": "open_clip/ViT-L-14/laion2b_s32b_b82k",
            "normalizeEmbeddings": true,
            "textPreprocessing": {
                "split_length": "2",
                "split_method": "sentence",
                "split_overlap": "0"
            },
            "imagePreprocessing": {},
            "annParameters": {
                "spaceType": "prenormalized-angular",
                "parameters": {
                    "ef_construction": "128",
                    "m": "16"
                }
            },
            "marqoVersion": "2.0.2-beta",
            "filterStringMaxLength": "20"
        }
    ]
}
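In the sample response above, numeric fields such as numberOfShards and docs.count are returned as strings. The sketch below shows one way to turn a parsed response of that shape into typed summaries; `summarize_indexes` is a hypothetical helper, it assumes every entry carries the fields shown in the sample, and READY is the only status value that appears on this page.

```python
def summarize_indexes(response):
    """Convert a parsed list-indexes response into typed summaries.

    Hypothetical helper; assumes the response shape shown in the
    sample above, where counts are encoded as strings.
    """
    summaries = []
    for entry in response.get("results", []):
        summaries.append(
            {
                "name": entry["indexName"],
                "ready": entry["indexStatus"] == "READY",
                "shards": int(entry["numberOfShards"]),
                "replicas": int(entry["numberOfReplicas"]),
                "inference_nodes": int(entry["numberOfInferences"]),
                "documents": int(entry["docs.count"]),
                "endpoint": entry["marqoEndpoint"],
            }
        )
    return summaries


# A trimmed copy of the sample response from this page
sample_response = {
    "results": [
        {
            "indexName": "imageindex",
            "numberOfShards": "1",
            "numberOfReplicas": "0",
            "indexStatus": "READY",
            "numberOfInferences": "1",
            "docs.count": "0",
            "marqoEndpoint": "https://imageindex-c8ua99-w8nt2f73.marqo-staging.com",
        }
    ]
}

summaries = summarize_indexes(sample_response)
ready_names = [s["name"] for s in summaries if s["ready"]]
```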