Best Practices for Marqo Cloud

Our Guides & How To section covers general best practices for working with Marqo. This page covers best practices that are specific to Marqo Cloud.

Production Indexes

For production indexes, we recommend pointing the py-marqo client at the URL of your Marqo index endpoint rather than at the "api.marqo.ai" endpoint. Our control plane is very reliable, but bypassing the control plane endpoint removes one more possible point of failure for your production index.

To access the endpoint of your index, visit Find Your Endpoint.

curl -XPOST 'your_endpoint/indexes/my-first-index/documents' \
-H 'x-api-key: XXXXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
"documents": [ 
    {
        "Title": "The Travels of Marco Polo",
        "Description": "A 13th-century travelogue describing the travels of Polo",
        "Genre": "History"
    },
    {
        "Title": "Extravehicular Mobility Unit (EMU)",
        "Description": "The EMU is a spacesuit that provides environmental protection",
        "_id": "article_591",
        "Genre": "Science"
    }
],
"tensorFields": ["Description"]
}'

import marqo

mq = marqo.Client("your_endpoint", api_key="XXXXXXXXXXXXXXX")
mq.index("my-first-index").add_documents([
    {
        "Title": "The Travels of Marco Polo",
        "Description": "A 13th-century travelogue describing the travels of Polo",
        "Genre": "History"
    },
    {
        "Title": "Extravehicular Mobility Unit (EMU)",
        "Description": "The EMU is a spacesuit that provides environmental protection",
        "_id": "article_591",
        "Genre": "Science"
    }],
    tensor_fields=["Description"]
)

Replace your_endpoint with your actual endpoint URL and the XXXXXXXXXXXXXXX placeholder with your API key.
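One way to keep the endpoint and key out of source code is to read them from the environment and refuse to start if the control plane endpoint was configured by mistake. The following is a minimal sketch, not part of py-marqo: the variable names MARQO_INDEX_URL and MARQO_API_KEY and both helper functions are hypothetical, and only marqo.Client(url, api_key=...) comes from the example above.

```python
import os
from urllib.parse import urlparse

# Control plane host named in the recommendation above.
CONTROL_PLANE_HOST = "api.marqo.ai"


def load_marqo_settings(env=os.environ):
    """Return (endpoint, api_key), rejecting the control plane endpoint.

    MARQO_INDEX_URL and MARQO_API_KEY are hypothetical variable names
    chosen for this sketch.
    """
    endpoint = env.get("MARQO_INDEX_URL", "").strip().rstrip("/")
    api_key = env.get("MARQO_API_KEY", "").strip()
    if not endpoint or not api_key:
        raise ValueError("MARQO_INDEX_URL and MARQO_API_KEY must both be set")

    host = urlparse(endpoint).hostname
    if host is None:
        raise ValueError("MARQO_INDEX_URL must include a scheme such as https://")
    if host == CONTROL_PLANE_HOST:
        raise ValueError("use the index endpoint, not the control plane endpoint")
    return endpoint, api_key


def make_client(env=os.environ):
    """Build a py-marqo client that talks to the index endpoint directly."""
    # Imported lazily so the settings check can run without py-marqo installed.
    import marqo

    endpoint, api_key = load_marqo_settings(env)
    return marqo.Client(endpoint, api_key=api_key)
```

Calling make_client() once at startup makes a misconfigured endpoint fail fast instead of surfacing later as a production dependency on the control plane.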

Pinning py-marqo

We recommend pinning the version of py-marqo when working with production indexes and environments.

marqo==3.8.1

We occasionally release new major versions of the Python client, which contain breaking changes. A major release is indicated by the first number in the client's semantic version, for example when the version goes from 3.x.y to 4.0.0. If the version is not pinned and your production system pulls the latest release after a major version change, your production system may break.
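As an extra safeguard, a deployment can compare the installed client version against the pin at startup. The sketch below uses only the Python standard library; the function names and status strings are hypothetical, and the version parsing assumes plain MAJOR.MINOR.PATCH strings like the one pinned above.

```python
from importlib import metadata

PINNED_VERSION = "3.8.1"  # the pin shown above


def major(version):
    """Return the first number of a MAJOR.MINOR.PATCH version string."""
    return int(version.split(".")[0])


def is_major_upgrade(pinned, candidate):
    """True when candidate has a higher major version than pinned."""
    return major(candidate) > major(pinned)


def check_installed(package="marqo", pinned=PINNED_VERSION):
    """Describe how the installed package version relates to the pin."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"
    if installed == pinned:
        return "ok"
    if is_major_upgrade(pinned, installed):
        return "major version change"
    return "version drift"
```

A startup script could log the result of check_installed() and abort on "major version change" so that an unpinned environment is caught before it serves traffic.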

Production Systems with High Availability Requirements

For production systems with high availability requirements, we strongly recommend creating an index with a high availability configuration. This means the index has:

  • At least 2 GPU or CPU.Large inference nodes
  • Balanced or performance storage, with at least 1 replica

This ensures that the underlying servers are distributed across at least 2 availability zones, and can withstand a complete outage of one of the zones.
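The two criteria above can be checked against a settings dictionary before an index is created. This is a rough sketch under stated assumptions: the keys (inferenceType, numberOfInferences, storageClass, numberOfReplicas) and the values such as "marqo.CPU.large" and "marqo.balanced" are assumed to match the Marqo Cloud index settings, so verify them against the create index reference before relying on this check. The function itself is hypothetical.

```python
# Assumed setting values; confirm against the Marqo Cloud create index reference.
HA_INFERENCE_TYPES = {"marqo.GPU", "marqo.CPU.large"}
HA_STORAGE_CLASSES = {"marqo.balanced", "marqo.performance"}


def high_availability_gaps(settings):
    """Return the reasons settings fall short of the criteria above.

    An empty list means both criteria are met.
    """
    gaps = []
    if settings.get("inferenceType") not in HA_INFERENCE_TYPES:
        gaps.append("inference nodes must be GPU or CPU.Large")
    if settings.get("numberOfInferences", 0) < 2:
        gaps.append("at least 2 inference nodes are required")
    if settings.get("storageClass") not in HA_STORAGE_CLASSES:
        gaps.append("storage must be balanced or performance")
    if settings.get("numberOfReplicas", 0) < 1:
        gaps.append("at least 1 replica is required")
    return gaps


# Example settings that satisfy both criteria.
ha_settings = {
    "inferenceType": "marqo.CPU.large",
    "numberOfInferences": 2,
    "storageClass": "marqo.balanced",
    "numberOfReplicas": 1,
}
```

Running a check like this in a deployment pipeline makes it harder for a production index to be created with a single inference node or without a replica.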