Indexes
This page details how to create, modify, delete, and list indexes on Marqo Cloud. You can find your Marqo Cloud API key
in the API keys tab of the Marqo console.
Note: The existing create, delete, and modify index endpoints remain unchanged for Marqo 1.0 indexes. Check out the documentation for 1.0 indexes here. Marqo has introduced new APIs for these endpoints for Marqo 2.0 indexes. These endpoints follow the pattern /api/v2/indexes/ and are described below.
Create index
POST https://api.marqo.ai/api/v2/indexes/{index_name}
Create an index with (optional) settings.
This endpoint accepts the application/json content type.
Marqo Cloud creates dedicated infrastructure for each index. Using the create index endpoint, you can specify the type
of storage for the index (storageClass) and the type of inference (inferenceType). The number of storage instances is
defined by numberOfShards, the number of replicas by numberOfReplicas, and the number of Marqo inference nodes
by numberOfInferences.
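As a minimal sketch of how these five infrastructure settings fit together, the hypothetical helper below assembles them into a dictionary and rejects values outside the option lists given in the body parameters table on this page. The helper name, its default values (taken from the example request), and the lower-bound checks are assumptions for illustration; they are not part of the Marqo client.

```python
# Option lists copied from the body parameters table on this page.
INFERENCE_TYPES = ("marqo.CPU.small", "marqo.CPU.large", "marqo.GPU")
STORAGE_CLASSES = ("marqo.basic", "marqo.balanced", "marqo.performance")


def build_infrastructure_settings(
    inference_type="marqo.CPU.large",
    storage_class="marqo.basic",
    number_of_shards=1,
    number_of_replicas=0,
    number_of_inferences=1,
):
    """Hypothetical helper: return the infrastructure part of a create-index body."""
    if inference_type not in INFERENCE_TYPES:
        raise ValueError(f"unknown inferenceType: {inference_type!r}")
    if storage_class not in STORAGE_CLASSES:
        raise ValueError(f"unknown storageClass: {storage_class!r}")
    # Assumed lower bounds; actual limits depend on your account.
    if number_of_shards < 1:
        raise ValueError("numberOfShards must be at least 1")
    if number_of_replicas < 0 or number_of_inferences < 0:
        raise ValueError("replica and inference counts cannot be negative")
    return {
        "inferenceType": inference_type,
        "storageClass": storage_class,
        "numberOfShards": number_of_shards,
        "numberOfReplicas": number_of_replicas,
        "numberOfInferences": number_of_inferences,
    }


settings = build_infrastructure_settings(storage_class="marqo.balanced")
```

The resulting dictionary can be merged with model settings (for example `{"model": "hf/e5-base-v2", **settings}`) before it is sent as the request body.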
Example
curl -XPOST 'https://api.marqo.ai/api/v2/indexes/my-first-index' \
-H 'x-api-key: XXXXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
"treatUrlsAndPointersAsImages": false,
"model": "hf/e5-base-v2",
"numberOfShards": 1,
"numberOfReplicas": 0,
"inferenceType": "marqo.CPU.large",
"storageClass": "marqo.basic",
"numberOfInferences": 1
}'
import marqo
mq = marqo.Client("https://api.marqo.ai", api_key="XXXXXXXXXXXXXXX")
index_settings = {
"treatUrlsAndPointersAsImages": False,
"model": "hf/e5-base-v2",
"numberOfShards": 1,
"numberOfReplicas": 0,
"inferenceType": "marqo.CPU.large",
"storageClass": "marqo.basic",
"numberOfInferences": 1
}
mq.create_index("my-first-index", settings_dict=index_settings)
Response: 200 OK
{"acknowledged":true, "shards_acknowledged":true, "index":"my-first-index"}
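If you prefer not to depend on curl or the Marqo client, the same request can be sketched with the Python standard library. The function name `build_create_index_request` is hypothetical; the URL pattern, the `x-api-key` header, and the JSON body mirror the curl example above. The sketch only builds the request object, so nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.marqo.ai/api/v2/indexes"


def build_create_index_request(index_name, api_key, settings):
    """Build (but do not send) the POST request for the create index endpoint."""
    # Percent-encode the index name so it is safe to place in the URL path.
    url = f"{API_BASE}/{urllib.parse.quote(index_name, safe='')}"
    return urllib.request.Request(
        url=url,
        data=json.dumps(settings).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_index_request(
    "my-first-index",
    "XXXXXXXXXXXXXXX",
    {"model": "hf/e5-base-v2", "inferenceType": "marqo.CPU.large"},
)
# To send it: urllib.request.urlopen(request)
```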
Path parameters
Name | Type | Description |
---|---|---|
index_name | String | Name of the index |
Body Parameters
The settings for the index. The settings are represented as a nested JSON object.
Name | Type | Default value | Description |
---|---|---|---|
treatUrlsAndPointersAsImages | Boolean | "" | Fetch images from pointers |
model | String | hf/e5-base-v2 | The model to use to vectorise doc content in add_documents() calls for the index |
modelProperties | Dictionary | "" | The model properties object corresponding to model (for custom models) |
normalizeEmbeddings | Boolean | true | Normalize the embeddings to have unit length |
textPreprocessing | Dictionary | "" | The text preprocessing object |
imagePreprocessing | Dictionary | "" | The image preprocessing object |
annParameters | Dictionary | "" | The ANN algorithm parameter object |
type | String | unstructured | Type of the index |
vectorNumericType | String | float | Numeric type for vector encoding |
filterStringMaxLength | Integer | 20 | Specifies the maximum character length allowed for strings used in filtering queries within unstructured indexes. This means that any string field you intend to use as a filter in these indexes should not exceed 20 characters in length. |
inferenceType | String | marqo.CPU.small | Type of inference for the index. Options are "marqo.CPU.small" (deprecated), "marqo.CPU.large", "marqo.GPU". |
storageClass | String | marqo.basic | Type of storage for the index. Options are "marqo.basic", "marqo.balanced", "marqo.performance". |
numberOfShards | Integer | 1 | The number of shards for the index. |
numberOfReplicas | Integer | 0 | The number of replicas for the index. |
numberOfInferences | Integer | 1 | The number of inference nodes for the index. |
textChunkPrefix | String | "" or "passage: " for e5 models | The prefix added to indexed text document chunks when embedding. |
textQueryPrefix | String | "" or "query: " for e5 models | The prefix added to text queries when embedding. |
Text Preprocessing Object
The textPreprocessing object contains the specifics of how you want the index to preprocess text. The parameters are
as follows:
Name | Type | Default value | Description |
---|---|---|---|
splitLength | Integer | 2 | The length of the chunks after splitting by splitMethod |
splitOverlap | Integer | 0 | The length of overlap between adjacent chunks |
splitMethod | String | sentence | The method by which text is chunked (character, word, sentence, or passage) |
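To make the interaction between splitLength, splitOverlap, and splitMethod concrete, here is an illustrative chunker. It is a sketch written for this page, not Marqo's implementation: the function name `chunk_text` is hypothetical, the sentence splitter is a naive regular expression, and the passage method is left out.

```python
import re


def chunk_text(text, split_length=2, split_overlap=0, split_method="sentence"):
    """Illustrative chunker; a sketch, not Marqo's actual preprocessing code."""
    if not 0 <= split_overlap < split_length:
        raise ValueError("split_overlap must be >= 0 and smaller than split_length")
    if split_method == "character":
        units, joiner = list(text), ""
    elif split_method == "word":
        units, joiner = text.split(), " "
    elif split_method == "sentence":
        # Naive split: break after ., ! or ? when followed by whitespace.
        units, joiner = re.split(r"(?<=[.!?])\s+", text.strip()), " "
    else:
        raise ValueError(f"unsupported split_method in this sketch: {split_method!r}")

    step = split_length - split_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(units), step):
        chunks.append(joiner.join(units[start:start + split_length]))
        if start + split_length >= len(units):
            break  # the window already reached the end of the text
    return chunks


text = "One. Two. Three. Four. Five."
default_chunks = chunk_text(text)                     # defaults: 2 sentences, no overlap
overlap_chunks = chunk_text(text, split_overlap=1)    # adjacent chunks share one sentence
```

With the defaults, the five sentences become three chunks of at most two sentences each. With an overlap of 1, every chunk repeats the last sentence of the previous chunk, which yields four chunks in this sketch.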
Image Preprocessing Object
The imagePreprocessing object contains the specifics of how you want the index to preprocess images. The parameters
are as follows:
Name | Type | Default value | Description |
---|---|---|---|
patchMethod | String | null | The method by which images are chunked (simple or frcnn) |
ANN Algorithm Parameter object
The annParameters object contains hyperparameters for the approximate nearest neighbour algorithm used for tensor
storage within Marqo. The parameters are as follows:
Name | Type | Default value | Description |
---|---|---|---|
spaceType | String | prenormalized-angular | The function used to measure the distance between two points in ANN (l1, l2, linf, or prenormalized-angular). |
parameters | Dictionary | "" | The hyperparameters for the Marqo index's HNSW graphs. |
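The spaceType names correspond to standard vector distance functions. The plain-Python sketch below shows what each one measures on small vectors. The function names are hypothetical, and the sketch assumes that prenormalized-angular compares unit-length vectors (as produced when normalizeEmbeddings is true) through their dot product; how the storage engine converts these values into search scores is not covered here.

```python
import math


def l1_distance(a, b):
    """Sum of absolute coordinate differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Straight-line (Euclidean) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def linf_distance(a, b):
    """Largest absolute coordinate difference (Chebyshev distance)."""
    return max(abs(x - y) for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def prenormalized_cosine(a, b):
    """Cosine similarity for vectors ASSUMED to be unit length already,
    so the dot product alone is enough (no division by the norms)."""
    return sum(x * y for x, y in zip(a, b))


a, b = [1.0, 0.0], [0.0, 1.0]
unit = normalize([3.0, 4.0])  # a vector of length 5 scaled down to length 1
```

Skipping the norm computation is the reason a prenormalized space type is only meaningful when the stored embeddings have unit length.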
HNSW Method Parameters Object
parameters can have the following values:
Name | Type | Default value | Description |
---|---|---|---|
efConstruction | Integer | 512 | The size of the dynamic list used during k-NN graph creation. Higher values lead to a more accurate graph but slower indexing speed. It is recommended to keep this between 2 and 800 (maximum is 4096). |
m | Integer | 16 | The number of bidirectional links that the plugin creates for each new element. Increasing and decreasing this value can have a large impact on memory consumption. Keep this value between 2 and 100. |
Model Properties Object
modelProperties is a flexible object that is used to set up models that aren't available in Marqo by default (models
available by default are listed here).
The structure of modelProperties will vary depending on the model.
For OpenCLIP models, see here for modelProperties format and example usage.
For Generic SBERT models, see here for modelProperties format and example usage.
Below is a sample index settings JSON object. When using the Python client, pass this dictionary as the settings_dict
parameter of the create_index method.
{
"type": "unstructured",
"vectorNumericType": "float",
"treatUrlsAndPointersAsImages": true,
"model": "open_clip/ViT-L-14/laion2b_s32b_b82k",
"normalizeEmbeddings": true,
"textPreprocessing": {
"splitLength": 2,
"splitOverlap": 0,
"splitMethod": "sentence"
},
"imagePreprocessing": {
"patchMethod": null
},
"annParameters": {
"spaceType": "prenormalized-angular",
"parameters": {
"efConstruction": 512,
"m": 16
}
},
"filterStringMaxLength": 20
}
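When the settings are written as a Python dictionary rather than raw JSON, the literals differ: JSON `true` and `null` become Python `True` and `None`. The short sketch below uses only the standard `json` module and a trimmed copy of the sample settings to show that serialising the dictionary produces the JSON spellings again.

```python
import json

# Trimmed copy of the sample settings, written with Python literals.
index_settings = {
    "treatUrlsAndPointersAsImages": True,        # JSON: true
    "imagePreprocessing": {"patchMethod": None}, # JSON: null
    "annParameters": {
        "spaceType": "prenormalized-angular",
        "parameters": {"efConstruction": 512, "m": 16},
    },
}

body = json.dumps(index_settings)
print(body)
```

Pasting the JSON sample into Python source unchanged would fail with a `NameError`, because `true` and `null` are not Python names.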
Modify index
You can modify the settings of an existing index, such as the number of inference nodes or the type of inference node.
PUT https://api.marqo.ai/api/v2/indexes/{index_name}
Example
curl -XPUT 'https://api.marqo.ai/api/v2/indexes/my-first-index' \
-H 'x-api-key: XXXXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
"numberOfInferences": 1,
"inferenceType": "marqo.CPU.large"
}'
Response: 200 OK
{"acknowledged":true}
Path parameters
Name | Type | Description |
---|---|---|
index_name | String | Name of the index |
Body Parameters
The settings for the index. The settings are represented as a nested JSON object.
Name | Type | Default value | Description |
---|---|---|---|
inferenceType | String | marqo.CPU.small | Type of inference for the index. Options are "marqo.CPU.small" (deprecated), "marqo.CPU.large", "marqo.GPU". |
numberOfInferences | Integer | 1 | Defines the number of inference nodes for the index. The minimum value is 0, and the maximum value is 5 by default, but this is dependent on your account limits. |
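A client-side check can catch out-of-range values before the PUT request is sent. The sketch below is a hypothetical helper, not part of the Marqo client: it builds the modify body and enforces the documented range of 0 to 5 inference nodes, with the upper bound exposed as a parameter because it depends on your account limits. Treating each field as individually optional is an assumption of this sketch.

```python
# Options copied from the body parameters table above.
INFERENCE_TYPES = ("marqo.CPU.small", "marqo.CPU.large", "marqo.GPU")


def build_modify_body(number_of_inferences=None, inference_type=None, max_inferences=5):
    """Hypothetical helper: build the JSON body for the modify index endpoint."""
    body = {}
    if number_of_inferences is not None:
        if not 0 <= number_of_inferences <= max_inferences:
            raise ValueError(
                f"numberOfInferences must be between 0 and {max_inferences}"
            )
        body["numberOfInferences"] = number_of_inferences
    if inference_type is not None:
        if inference_type not in INFERENCE_TYPES:
            raise ValueError(f"unknown inferenceType: {inference_type!r}")
        body["inferenceType"] = inference_type
    if not body:
        raise ValueError("nothing to modify")
    return body


body = build_modify_body(number_of_inferences=2, inference_type="marqo.GPU")
```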
Delete index
Delete an index.
Note: This operation cannot be undone, and the deleted index cannot be recovered.
DELETE https://api.marqo.ai/api/v2/indexes/{index_name}
Example
curl -H 'x-api-key: XXXXXXXXXXXXXXX' -XDELETE https://api.marqo.ai/api/v2/indexes/my-first-index
results = mq.index("my-first-index").delete()
Response: 200 OK
{"acknowledged": true}
List indexes
GET https://api.marqo.ai/api/v2/indexes
Lists the indexes in your Marqo Cloud account.
Example
curl -H 'X-API-KEY: XXXXXXXXXXXXXXX' https://api.marqo.ai/api/v2/indexes
mq.get_indexes()
Response: 200 OK
{
"results": [
{
"Created": "2024-01-02T23:03:37.205347",
"indexName": "imageindex",
"numberOfShards": "1",
"numberOfReplicas": "0",
"indexStatus": "READY",
"numberOfInferences": "1",
"storageClass": "BASIC",
"inferenceType": "CPU",
"docs.count": "0",
"store.size": "0",
"docs.deleted": "0",
"search.queryTotal": "0",
"treatUrlsAndPointersAsImages": true,
"marqoEndpoint": "https://imageindex-c8ua99-w8nt2f73.marqo-staging.com",
"type": "unstructured",
"vectorNumericType": "float",
"model": "open_clip/ViT-L-14/laion2b_s32b_b82k",
"normalizeEmbeddings": true,
"textPreprocessing": {
"split_length": "2",
"split_method": "sentence",
"split_overlap": "0"
},
"imagePreprocessing": {},
"annParameters": {
"spaceType": "prenormalized-angular",
"parameters": {
"ef_construction": "128",
"m": "16"
}
},
"marqoVersion": "2.0.2-beta",
"filterStringMaxLength": "20"
}
]
}
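Note that the counts in this response are JSON strings ("1", "0") rather than numbers. The sketch below shows one way to summarise such a response, assuming the parsed JSON has the shape shown above; the function name `summarize_indexes` and the choice of fields are hypothetical, and the sample data is trimmed from the response on this page.

```python
NUMERIC_FIELDS = ("numberOfShards", "numberOfReplicas", "numberOfInferences", "docs.count")


def summarize_indexes(response):
    """Return one small dict per index, with string counts converted to int."""
    summaries = []
    for entry in response.get("results", []):
        summary = {
            "indexName": entry["indexName"],
            "indexStatus": entry["indexStatus"],
            "marqoEndpoint": entry["marqoEndpoint"],
        }
        for field in NUMERIC_FIELDS:
            summary[field] = int(entry[field])  # e.g. "1" -> 1
        summaries.append(summary)
    return summaries


# Trimmed copy of the sample response above.
sample_response = {
    "results": [
        {
            "indexName": "imageindex",
            "indexStatus": "READY",
            "numberOfShards": "1",
            "numberOfReplicas": "0",
            "numberOfInferences": "1",
            "docs.count": "0",
            "marqoEndpoint": "https://imageindex-c8ua99-w8nt2f73.marqo-staging.com",
        }
    ]
}

summaries = summarize_indexes(sample_response)
ready_names = [s["indexName"] for s in summaries if s["indexStatus"] == "READY"]
```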