Update Existing Documents

Update an array of documents in a given index.

Each document must contain an _id field to identify the document to update. Include only the fields and values you want to change. This endpoint works only on documents that already exist in the index.

Use this endpoint to update documents in a structured index, or in an unstructured index created after Marqo 2.16, by modifying existing fields or adding new ones. You can only modify or add fields that are not tensor fields or dependent fields of a multimodal combination field. If the document does not exist yet, use the add_documents endpoint instead. To update a tensor field, a dependent field of a multimodal combination field, or a document in an unstructured index created before Marqo 2.16, see the useExistingTensors feature. Unstructured indexes created before Marqo 2.16 do not support this endpoint.

This endpoint accepts the application/json content type.


PATCH /indexes/{index_name}/documents

Path parameters

| Name | Type | Description |
| --- | --- | --- |
| index_name | String | Name of the index (a structured index, or an unstructured index created after Marqo 2.16) |

Body

In the REST API and for cURL users, these parameters are in lowerCamelCase, as presented in the following table. The Python client uses the Pythonic snake_case equivalents.

| Update documents parameters | Value Type | Default Value | Description |
| --- | --- | --- | --- |
| documents | Array of objects | n/a | An array of documents. Each document is represented as a JSON object and must contain a valid _id field to specify the target document. You only need to include the fields you want to update in the JSON object. You cannot update a tensor field in a structured index. |

Response

The response of the update_documents endpoint in Marqo operates on two levels. First, a status code of 200 in the overall response indicates that the batch request has been successfully received and processed by Marqo. The response has the following fields:

| Field Name | Type | Description |
| --- | --- | --- |
| errors | Boolean | Indicates whether any errors occurred during the processing of the batch request. |
| items | Array | An array of objects, each representing the processing status of an individual document in the batch. |
| processingTimeMs | Integer | The time taken to process the batch request, in milliseconds. |
| index_name | String | The name of the index in which the documents were updated. |

However, a 200 status does not necessarily imply that each individual document within the batch was processed without issues. For each document in the batch, there will be an associated response code that specifies the status of that particular document's processing. These individual response codes provide granular feedback, allowing users to discern which documents were successfully processed, which encountered errors, and the nature of any issues encountered. Each item in the items array has the following fields:

| Field Name | Type | Description |
| --- | --- | --- |
| _id | String | The ID of the document that was processed. |
| status | Integer | The status code of the document processing. |
| message | String | A message that provides additional information about the processing status of the document. This field only exists when the status is not 200. |

The individual document responses use the following HTTP status codes (non-exhaustive list):

| Status Code | Description |
| --- | --- |
| 200 | The document was updated successfully. |
| 400 | Bad request. Returned for invalid input (e.g., invalid field types). Inspect message for details. |
| 404 | The target document is not in the index. |
| 429 | The Marqo vector store is receiving too many requests. Please try again later. |
| 500 | Internal error. |
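
Since a 200 on the batch does not mean every document succeeded, it can help to scan the items array after each call. The following is a minimal sketch, assuming the response has already been parsed into a Python dict with the fields described above. The helper name `failed_items` and the sample message text are hypothetical and not part of the Marqo client.

```python
def failed_items(response):
    """Return (_id, status, message) for every item whose status is not 200."""
    failures = []
    for item in response.get("items", []):
        status = item.get("status")
        if status != 200:
            # `message` is only present when the status is not 200.
            failures.append((item.get("_id"), status, item.get("message", "")))
    return failures


# Illustrative response shaped like the documented fields (values are made up).
example_response = {
    "errors": True,
    "index_name": "my-first-structured-index",
    "items": [
        {"_id": "1", "status": 200},
        {"_id": "3", "status": 404, "message": "Document not found"},
    ],
    "processingTimeMs": 12,
}

for doc_id, status, message in failed_items(example_response):
    print(f"{doc_id}: {status} {message}")
```

Checking the top-level errors flag first is a cheap way to skip the scan when the whole batch succeeded.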

Update behavior for unstructured indexes

note

Unstructured indexes created after Marqo 2.16 support the update_documents endpoint. However, to optimize performance, this endpoint returns a 400 Bad Request status code for the individual document if there is an issue updating the target document, without detailed diagnostics. You may receive a 400 status code for any of the following reasons:

  • The document ID specified in the request does not exist in the index (verify and correct the _id field).
  • The request attempts to change the data type of an existing field (ensure consistent field types across updates).
  • The request tries to update a tensor field, a multimodal combination field, or a dependent field of a multimodal combination field (these fields cannot be updated via this endpoint as they contain tensors).

In one specific retriable case, a 400 status code may be returned even if the request is valid. This can happen when another add_documents operation, or an update_documents operation involving map-type fields, is concurrently modifying the same document ID. The error is transient and retrying the request after a short delay will typically resolve the issue.
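
The transient case above can be handled with a bounded retry. The following is a minimal sketch, not Marqo's own mechanism: `update_with_retry`, `max_attempts`, and `delay_seconds` are hypothetical names, and `update_fn` is assumed to be any callable that takes a list of documents and returns a response dict shaped like the one documented above. Because a 400 can also signal a permanent problem (missing document, type change, tensor field), the number of attempts is capped.

```python
import time


def update_with_retry(update_fn, documents, max_attempts=3, delay_seconds=0.5):
    """Call update_fn(documents), re-sending only the documents that returned 400.

    Returns a dict mapping each _id to the last item status received for it.
    """
    pending = list(documents)
    last_items = {}
    for attempt in range(max_attempts):
        response = update_fn(pending)
        retry_ids = set()
        for item in response.get("items", []):
            last_items[item["_id"]] = item
            if item["status"] == 400:
                retry_ids.add(item["_id"])
        # Keep only the documents whose update came back with a 400.
        pending = [doc for doc in pending if doc["_id"] in retry_ids]
        if not pending:
            break
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)
    return last_items
```

With the Python client, the bound method can be passed directly, for example `update_with_retry(mq.index("my-first-structured-index").update_documents, docs)`. Documents that still show status 400 after the last attempt most likely fall under one of the non-transient causes listed above.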

Example

=== "Marqo Open Source"

=== "cURL"

# Let's create a structured index and add documents to it
curl -X POST 'http://localhost:8882/indexes/my-first-structured-index' \
-H "Content-Type: application/json" \
-d '{
"type": "structured",
"allFields": [
{"name": "img", "type": "image_pointer"},
{"name": "title", "type": "text"},
{"name": "label", "type": "text", "features": ["filter"]}
],
"model": "open_clip/ViT-B-32/laion2b_s34b_b79k",
"tensorFields": ["img", "title"]
}'

curl -X POST 'http://localhost:8882/indexes/my-first-structured-index/documents' \
-H "Content-Type: application/json" \
-d '{
"documents":[
{
"img": "https://github.com/marqo-ai/marqo/blob/mainline/examples/ImageSearchGuide/data/image0.jpg?raw=true",
"title": "A lady taking a photo",
"label": "lady",
"_id": "1"
},
{
"img": "https://github.com/marqo-ai/marqo/blob/mainline/examples/ImageSearchGuide/data/image1.jpg?raw=true",
"title": "A plane flying in the sky",
"label": "airplane",
"_id": "2"
}
]
}'
# Now let's update the documents by changing their labels
curl -X PATCH 'http://localhost:8882/indexes/my-first-structured-index/documents' \
-H "Content-Type: application/json" \
-d '{
"documents":[
{
"label": "person",
"_id": "1"
},
{
"_id": "2",
"label": "plane"
}
]
}'

=== "Python"

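import marqo

# Assumes a local Marqo instance at the same address as the cURL example
mq = marqo.Client(url="http://localhost:8882")

# Let's create a structured index and add documents to it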
mq.create_index(
"my-first-structured-index",
type="structured",
all_fields=[
{"name": "img", "type": "image_pointer"},
{"name": "title", "type": "text"},
{"name": "label", "type": "text", "features": ["filter"]},
],
model="open_clip/ViT-B-32/laion2b_s34b_b79k",
tensor_fields=["img", "title"],
)
mq.index("my-first-structured-index").add_documents(
[
{
"img": "https://github.com/marqo-ai/marqo/blob/mainline/examples/ImageSearchGuide/data/image0.jpg?raw=true",
"title": "A lady taking a photo",
"label": "lady",
"_id": "1",
},
{
"img": "https://github.com/marqo-ai/marqo/blob/mainline/examples/ImageSearchGuide/data/image1.jpg?raw=true",
"title": "A plane flying in the sky",
"label": "airplane",
"_id": "2",
},
]
)
# Now let's update the documents by changing their labels
mq.index("my-first-structured-index").update_documents(
[{"_id": "1", "label": "person"}, {"_id": "2", "label": "plane"}]
)

=== "Marqo Cloud"

For Marqo Cloud, you need the endpoint of your index; replace your_endpoint in the requests below with it. To find it, visit Find Your Endpoint. You will also need your API key. To obtain it, visit Find Your API Key.

=== "cURL"

# Let's create a structured index and add documents to it
curl -X POST 'https://api.marqo.ai/api/v2/indexes/my-first-structured-index' \
-H 'x-api-key: XXXXXXXXXXXXXXX' \
-H "Content-Type: application/json" \
-d '{
"type": "structured",
"allFields": [
{"name": "img", "type": "image_pointer"},
{"name": "title", "type": "text"},
{"name": "label", "type": "text", "features": ["filter"]}
],
"model": "open_clip/ViT-B-32/laion2b_s34b_b79k",
"tensorFields": ["img", "title"]
}'

curl -X POST 'your_endpoint/indexes/my-first-structured-index/documents' \
-H 'x-api-key: XXXXXXXXXXXXXXX' \
-H "Content-Type: application/json" \
-d '{
"documents":[
{
"img": "https://github.com/marqo-ai/marqo/blob/mainline/examples/ImageSearchGuide/data/image0.jpg?raw=true",
"title": "A lady taking a photo",
"label": "lady",
"_id": "1"
},
{
"img": "https://github.com/marqo-ai/marqo/blob/mainline/examples/ImageSearchGuide/data/image1.jpg?raw=true",
"title": "A plane flying in the sky",
"label": "airplane",
"_id": "2"
}
]
}'
# Now let's update the documents by changing their labels
curl -X PATCH 'your_endpoint/indexes/my-first-structured-index/documents' \
-H 'x-api-key: XXXXXXXXXXXXXXX' \
-H "Content-Type: application/json" \
-d '{
"documents":[
{
"label": "person",
"_id": "1"
},
{
"_id": "2",
"label": "plane"
}
]
}'

=== "Python"

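import marqo

# Replace the placeholder with your own API key
mq = marqo.Client(url="https://api.marqo.ai", api_key="XXXXXXXXXXXXXXX")

# Let's create a structured index and add documents to it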
mq.create_index(
"my-first-structured-index",
type="structured",
all_fields=[
{"name": "img", "type": "image_pointer"},
{"name": "title", "type": "text"},
{"name": "label", "type": "text", "features": ["filter"]},
],
model="open_clip/ViT-B-32/laion2b_s34b_b79k",
tensor_fields=["img", "title"],
)
mq.index("my-first-structured-index").add_documents(
[
{
"img": "https://github.com/marqo-ai/marqo/blob/mainline/examples/ImageSearchGuide/data/image0.jpg?raw=true",
"title": "A lady taking a photo",
"label": "lady",
"_id": "1",
},
{
"img": "https://github.com/marqo-ai/marqo/blob/mainline/examples/ImageSearchGuide/data/image1.jpg?raw=true",
"title": "A plane flying in the sky",
"label": "airplane",
"_id": "2",
},
]
)
# Now let's update the documents by changing their labels
mq.index("my-first-structured-index").update_documents(
[{"_id": "1", "label": "person"}, {"_id": "2", "label": "plane"}]
)

Response 200 OK

{
  "errors": false,
  "index_name": "my-first-structured-index",
  "items": [
    {
      "_id": "1",
      "status": 200
    },
    {
      "_id": "2",
      "status": 200
    }
  ],
  "processingTimeMs": 20.17
}

The update documents endpoint updates the fields of existing documents. It is available for structured indexes and for unstructured indexes created after Marqo 2.16. In the example, we updated the label of the documents with _id values "1" and "2". The response shows that both updates succeeded. The changes are reflected in the index and can be used for search and filtering. Note that you can only update fields that are not tensor fields.

Documents

Parameter: documents

Expected value: Array of documents (default maximum length: 128). Each document is a JSON object that must contain a valid _id field to specify the target document. You only need to include the fields you want to update in the JSON object.

[
{
"Title": "Your updated title 1",
"Description": "Your updated description 1",
"_id": "your-target-doc-id-1"
},
{
"Title": "Your updated title 2",
"Description": "Your updated description 2",
"_id": "your-target-doc-id-2"
}
]
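
When there are more updates than fit in one request, the list can be split on the client side before sending. The following is a minimal sketch, assuming the default maximum of 128 documents per request stated above. The helper name `batched_updates` is hypothetical and not part of the Marqo client.

```python
def batched_updates(documents, batch_size=128):
    """Yield lists of at most batch_size documents, after checking each has an _id."""
    for position, doc in enumerate(documents):
        if "_id" not in doc:
            raise ValueError(f"document at position {position} has no _id")
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


# 300 partial updates are split into batches of 128, 128, and 44 documents.
updates = [{"_id": str(i), "Title": f"Your updated title {i}"} for i in range(300)]
for batch in batched_updates(updates):
    print(len(batch))
```

Each batch can then be passed to `update_documents` on its own. Validating the _id fields up front avoids sending a batch that is known to contain an unusable document.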