Indexes

List indexes

GET /indexes
Lists the existing indexes.

Example

curl http://localhost:8882/indexes

# Python client, where mq = marqo.Client(url="http://localhost:8882")
mq.get_indexes()

Response: 200 OK

{
  "results": [
    {
      "index_name": "Book Collection"
    },
    {
      "index_name": "Animal facts"
    }
  ]
}
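
As a small illustration of working with this response, the sketch below pulls the index names out of the JSON body. The helper name `index_names` is hypothetical, and the body is simply the sample response shown above rather than live output.

```python
import json

# Sample body copied from the response shown above (not live output).
response_body = """
{
  "results": [
    {"index_name": "Book Collection"},
    {"index_name": "Animal facts"}
  ]
}
"""


def index_names(body: str) -> list:
    """Return the index names contained in a GET /indexes response body."""
    payload = json.loads(body)
    # Each entry in "results" is an object with an "index_name" key.
    return [entry["index_name"] for entry in payload.get("results", [])]


print(index_names(response_body))
```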

Create index

Settings can be supplied when the index is created. Any setting that is omitted takes the default value listed in the tables below.

POST /indexes/{index_name}

Create an index with (optional) settings. This endpoint accepts the application/json content type.

Path parameters

| Name | Type | Description |
| --- | --- | --- |
| index_name | String | Name of the index |
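
To make the endpoint shape concrete, here is a minimal sketch that builds (but does not send) a create-index request with Python's standard library. The base URL is taken from the curl examples in this document. The helper name `build_create_index_request`, the percent-encoding of the index name, and the empty JSON object sent when no settings are given are all assumptions for illustration, not documented Marqo behavior.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumed local address, matching the curl examples in this document.
BASE_URL = "http://localhost:8882"


def build_create_index_request(index_name, settings=None):
    """Build a POST /indexes/{index_name} request without sending it."""
    # Percent-encode the name so characters such as spaces are URL-safe
    # (an assumption; the document does not describe name encoding).
    url = f"{BASE_URL}/indexes/{quote(index_name, safe='')}"
    # Settings are optional, so fall back to an empty JSON object.
    body = json.dumps(settings or {}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


req = build_create_index_request("my-first-index", {"number_of_shards": 5})
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it to a running instance.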

Body Parameters

The settings for the index. The settings are represented as a nested JSON object.

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| index_defaults | Dictionary | "" | The index defaults object |
| number_of_shards | Integer | 5 | The number of shards for the index |
| number_of_replicas | Integer | 1 | The number of replicas for the index |

Index Defaults Object

The index_defaults object contains the default settings for the index. The parameters are as follows:

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| treat_urls_and_pointers_as_images | Boolean | "" | Fetch images from pointers |
| model | String | hf/all_datasets_v4_MiniLM-L6 | The model to use for the index |
| normalize_embeddings | Boolean | true | Normalize the embeddings to have unit length |
| text_preprocessing | Dictionary | "" | The text preprocessing object |
| image_preprocessing | Dictionary | "" | The image preprocessing object |
| ann_parameters | Dictionary | "" | The ANN algorithm parameter object |
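
To illustrate what normalize_embeddings means by "unit length", the sketch below scales a vector so that its Euclidean norm is 1. It is a plain-Python illustration of the idea, not Marqo's internal code; the function name `normalize` is hypothetical.

```python
import math


def normalize(vector):
    """Scale a vector so that its Euclidean (L2) norm equals 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector has no direction; return it unchanged.
        return list(vector)
    return [x / norm for x in vector]


unit = normalize([3.0, 4.0])
print(unit, math.sqrt(sum(x * x for x in unit)))
```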

Text Preprocessing Object

The text_preprocessing object contains the specifics of how you want the index to preprocess text. The parameters are as follows:

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| split_length | Integer | 2 | The length of the chunks after splitting by split_method |
| split_overlap | Integer | 0 | The length of overlap between adjacent chunks |
| split_method | String | sentence | The method by which text is chunked (character, word, sentence, or passage) |
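
The interaction between split_length and split_overlap can be sketched roughly as follows for the sentence method. This is only an approximation under stated assumptions: the regex-based sentence splitter and the function name `chunk_sentences` are hypothetical, and Marqo's actual chunking may differ in detail.

```python
import re


def chunk_sentences(text, split_length=2, split_overlap=0):
    """Group sentences into chunks of split_length, sharing split_overlap sentences."""
    if split_overlap >= split_length:
        raise ValueError("split_overlap must be smaller than split_length")
    # Naive sentence splitter: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    step = split_length - split_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(" ".join(sentences[start:start + split_length]))
        if start + split_length >= len(sentences):
            break  # the last sentence is already covered
    return chunks


text = "One. Two. Three. Four. Five."
print(chunk_sentences(text))                   # defaults: length 2, overlap 0
print(chunk_sentences(text, split_overlap=1))  # adjacent chunks share a sentence
```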

Image Preprocessing Object

The image_preprocessing object contains the specifics of how you want the index to preprocess images. The parameters are as follows:

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| patch_method | String | null | The method by which images are chunked (simple or frcnn) |

ANN Algorithm Parameter Object

The ann_parameters object contains hyperparameters for the approximate nearest neighbour algorithm used for tensor storage within Marqo. The parameters are as follows:

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| space_type | String | cosinesimil | The function used to measure the distance between two points in ANN (l1, l2, linf, or cosinesimil) |
| parameters | Dictionary | "" | The hyperparameters for the ANN method (which is always hnsw for Marqo) |
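
For reference, the four space_type options correspond to the following textbook measures. The sketch shows the standard mathematical definitions only; the score that the underlying engine reports may be a transformed version of these values, and the function names simply mirror the option names.

```python
import math


def l1(a, b):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2(a, b):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def linf(a, b):
    """Chebyshev distance: largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def cosinesimil(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a, b = [1.0, 0.0], [0.0, 1.0]
print(l1(a, b), l2(a, b), linf(a, b), cosinesimil(a, b))
```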

HNSW Method Parameters Object

The parameters object (the HNSW method parameters) can have the following values:

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| ef_construction | Integer | 128 | The size of the dynamic list used during k-NN graph creation. Higher values lead to a more accurate graph but slower indexing speed. It is recommended to keep this between 2 and 800 (maximum is 4096). |
| m | Integer | 16 | The number of bidirectional links that the plugin creates for each new element. Increasing and decreasing this value can have a large impact on memory consumption. Keep this value between 2 and 100. |
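
The ranges stated above can be checked before a request is sent. The following is a minimal client-side sketch; `check_hnsw_parameters` is a hypothetical helper, not part of the Marqo client, and it treats only the stated maximum of 4096 as a hard limit while reporting the recommended ranges as warnings.

```python
def check_hnsw_parameters(ef_construction=128, m=16):
    """Return warnings for values outside the ranges recommended above."""
    if ef_construction > 4096:
        # 4096 is the only hard maximum stated in the table.
        raise ValueError("ef_construction must not exceed 4096")
    warnings = []
    if not 2 <= ef_construction <= 800:
        warnings.append("ef_construction is outside the recommended range 2-800")
    if not 2 <= m <= 100:
        warnings.append("m is outside the recommended range 2-100")
    return warnings


print(check_hnsw_parameters())                      # defaults
print(check_hnsw_parameters(ef_construction=1000))  # allowed, but not recommended
```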

Below is a sample index settings JSON object. When using the Python client, pass this dictionary as the settings_dict parameter for the create_index method.

{
    "index_defaults": {
        "treat_urls_and_pointers_as_images": false,
        "model": "hf/all_datasets_v4_MiniLM-L6",
        "normalize_embeddings": true,
        "text_preprocessing": {
            "split_length": 2,
            "split_overlap": 0,
            "split_method": "sentence"
        },
        "image_preprocessing": {
            "patch_method": null
        },
        "ann_parameters" : {
            "space_type": "cosinesimil",
            "parameters": {
                "ef_construction": 128,
                "m": 16
            }
        }
    },
    "number_of_shards": 5
}
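
Because the settings object is nested, changing a single value while keeping the rest can be done with a small merge helper. This is a sketch only: `with_overrides` is a hypothetical function, not part of the Marqo client, and the defaults are copied from the sample object above.

```python
import copy

# Defaults copied from the sample settings object above.
DEFAULT_SETTINGS = {
    "index_defaults": {
        "treat_urls_and_pointers_as_images": False,
        "model": "hf/all_datasets_v4_MiniLM-L6",
        "normalize_embeddings": True,
        "text_preprocessing": {
            "split_length": 2,
            "split_overlap": 0,
            "split_method": "sentence",
        },
        "image_preprocessing": {"patch_method": None},
        "ann_parameters": {
            "space_type": "cosinesimil",
            "parameters": {"ef_construction": 128, "m": 16},
        },
    },
    "number_of_shards": 5,
}


def with_overrides(overrides, defaults=DEFAULT_SETTINGS):
    """Return a copy of defaults with nested overrides applied."""
    merged = copy.deepcopy(defaults)  # never mutate the defaults
    stack = [(merged, overrides)]
    while stack:
        target, source = stack.pop()
        for key, value in source.items():
            if isinstance(value, dict) and isinstance(target.get(key), dict):
                # Descend into nested objects instead of replacing them.
                stack.append((target[key], value))
            else:
                target[key] = copy.deepcopy(value)
    return merged


settings = with_overrides(
    {"index_defaults": {"text_preprocessing": {"split_length": 3}}}
)
print(settings["index_defaults"]["text_preprocessing"])
```

The resulting dictionary has the same shape as the sample object, so it can be passed as settings_dict in the same way.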

Example

curl -XPOST 'http://localhost:8882/indexes/my-first-index' -H 'Content-type:application/json' -d '
{
    "index_defaults": {
        "treat_urls_and_pointers_as_images": false,
        "model": "hf/all_datasets_v4_MiniLM-L6",
        "normalize_embeddings": true,
        "text_preprocessing": {
            "split_length": 2,
            "split_overlap": 0,
            "split_method": "sentence"
        },
        "image_preprocessing": {
            "patch_method": null
        },
        "ann_parameters": {
            "space_type": "cosinesimil",
            "parameters": {
                "ef_construction": 128,
                "m": 16
            }
        }
    },
    "number_of_shards": 5
}'

import marqo

# Assumes a Marqo instance at the address used in the curl example.
mq = marqo.Client(url="http://localhost:8882")

# Same settings as the curl example, written as a Python dictionary.
index_settings = {
    "index_defaults": {
        "treat_urls_and_pointers_as_images": False,
        "model": "hf/all_datasets_v4_MiniLM-L6",
        "normalize_embeddings": True,
        "text_preprocessing": {
            "split_length": 2,
            "split_overlap": 0,
            "split_method": "sentence"
        },
        "image_preprocessing": {
            "patch_method": None
        },
        "ann_parameters": {
            "space_type": "cosinesimil",
            "parameters": {
                "ef_construction": 128,
                "m": 16
            }
        }
    },
    "number_of_shards": 5
}
mq.create_index("my-first-index", settings_dict=index_settings)