Configuring Marqo
Marqo is configured through environment variables passed to the Marqo container when it is run.
Configuring usage limits
Limits can be set to protect the resources of the machine Marqo is running on.
Configuration name | Default | Description |
---|---|---|
MARQO_MAX_INDEX_FIELDS | n/a | Maximum number of fields allowed per index |
MARQO_MAX_DOC_BYTES | 100000 | Maximum document size (bytes) allowed to be indexed |
MARQO_MAX_RETRIEVABLE_DOCS | n/a | Maximum number of documents allowed to be returned in a single request. The maximum value this can be set to is 10000. |
MARQO_MAX_CUDA_MODEL_MEMORY | 4 | Maximum CUDA memory usage (GB) for models in Marqo. For multi-GPU machines, this is the maximum memory for each GPU. |
MARQO_MAX_CPU_MODEL_MEMORY | 4 | Maximum RAM usage (GB) for models in Marqo. |
MARQO_MAX_VECTORISE_BATCH_SIZE | 16 | Maximum batch size for vectorisation operations (for example, when adding documents). |
VESPA_POOL_SIZE | n/a | The size of the connection pool for Vespa operations. |
VESPA_CONTENT_CLUSTER_NAME | n/a | Name of the Vespa content cluster. |
VESPA_SEARCH_TIMEOUT_MS | n/a | Amount of time (milliseconds) before a search request to Vespa times out. |
MARQO_MAX_DOCUMENTS_BATCH_SIZE | 128 | Maximum number of documents that can be included in a single request to the add_documents or update_documents endpoints. |
Example
docker run --name marqo -p 8882:8882 \
-e "MARQO_MAX_INDEX_FIELDS=400" \
-e "MARQO_MAX_DOC_BYTES=200000" \
-e "MARQO_MAX_RETRIEVABLE_DOCS=600" \
-e "MARQO_MAX_CUDA_MODEL_MEMORY=5" \
-e "VESPA_SEARCH_TIMEOUT_MS=2000" marqoai/marqo:latest
- The max number of fields per index is capped at 400
- The max size of an indexed document is 0.2 MB (200000 bytes)
- The max number of documents allowed to be returned in a single request is 600
- The max CUDA memory usage for models in Marqo is 5 GB
- The search timeout for Vespa is 2 seconds
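The document-size and batch-size limits above are enforced server-side, but a client can mirror them to fail fast before sending a request. The sketch below is illustrative only, assuming the limits from the example above; the helper names are not part of Marqo.

```python
import json

# Illustrative client-side guards mirroring the limits above; the
# constants and helper names here are assumptions, not part of Marqo.
MAX_DOC_BYTES = 200_000       # matches MARQO_MAX_DOC_BYTES in the example
MAX_DOCS_PER_BATCH = 128      # matches the MARQO_MAX_DOCUMENTS_BATCH_SIZE default


def check_doc_size(doc: dict) -> bool:
    """Return True if the JSON-serialised document fits within the limit."""
    return len(json.dumps(doc).encode("utf-8")) <= MAX_DOC_BYTES


def batch_documents(docs: list, batch_size: int = MAX_DOCS_PER_BATCH) -> list:
    """Split a document list into request-sized batches."""
    return [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]


docs = [{"_id": str(i), "text": "hello"} for i in range(300)]
batches = batch_documents(docs)
print(len(batches))  # 300 docs -> batches of 128, 128, 44
```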
Configure backend communication
This section describes the environment variables that can be used to configure Marqo's communication with the backend. Setting these variables is helpful when Marqo is running in a container and needs to communicate with a Vespa instance running in a separate container or on a different host machine.
Configuration name | Default | Description |
---|---|---|
VESPA_CONFIG_URL | "http://localhost:19071" | URL for Vespa configuration. |
VESPA_QUERY_URL | "http://localhost:8080" | URL for querying the Vespa instance. |
VESPA_DOCUMENT_URL | "http://localhost:8080" | URL for document operations in the Vespa instance. |
ZOOKEEPER_HOSTS | n/a | Hosts for the Zookeeper server; no "https" or "http" prefix is required in the string. If not set, Marqo will skip the connection to the Zookeeper server. |
Example
In this example, we assume that Marqo is running in a container and needs to communicate with a Vespa instance running in a separate container on the same host machine.
docker run --name marqo -p 8882:8882 --add-host host.docker.internal:host-gateway \
-e VESPA_CONFIG_URL="http://host.docker.internal:19071" \
-e VESPA_DOCUMENT_URL="http://host.docker.internal:8080" \
-e VESPA_QUERY_URL="http://host.docker.internal:8080" \
-e ZOOKEEPER_HOSTS="host.docker.internal:2181" \
marqoai/marqo:latest
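The defaults in the table above apply whenever a variable is not set explicitly. As a minimal sketch (the resolver function is an assumption of this example, not part of Marqo, which reads these variables internally):

```python
import os

# Documented defaults for the Vespa backend URLs (from the table above).
DEFAULTS = {
    "VESPA_CONFIG_URL": "http://localhost:19071",
    "VESPA_QUERY_URL": "http://localhost:8080",
    "VESPA_DOCUMENT_URL": "http://localhost:8080",
}


def resolve_backend_urls(env: dict = None) -> dict:
    """Overlay any explicitly set variables onto the documented defaults."""
    env = os.environ if env is None else env
    return {key: env.get(key, default) for key, default in DEFAULTS.items()}


# Only VESPA_QUERY_URL is overridden; the others fall back to defaults.
urls = resolve_backend_urls({"VESPA_QUERY_URL": "http://host.docker.internal:8080"})
print(urls["VESPA_CONFIG_URL"])  # http://localhost:19071
```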
Configuring preloaded patch models
- Variable: MARQO_PATCH_MODELS_TO_PRELOAD
- Default value: '[]'
- Expected value: A string of comma-separated patch model names. Currently supported patch models are: 'simple', 'overlap', 'fastercnn', 'frcnn', 'marqo-yolo', 'yolox', 'dino-v1', 'dino-v2', 'dino/v1', 'dino/v2'.

This is a list of patch models to load and pre-warm as Marqo starts. This prevents a delay during initial image processing.
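Because an unsupported name in this variable will surface only at startup, it can be worth validating the value before launching the container. A hedged sketch, where the validation helper is an illustration rather than part of Marqo:

```python
# Supported patch model names, as listed above.
SUPPORTED_PATCH_MODELS = {
    "simple", "overlap", "fastercnn", "frcnn", "marqo-yolo",
    "yolox", "dino-v1", "dino-v2", "dino/v1", "dino/v2",
}


def validate_patch_models(value: str) -> list:
    """Parse a comma-separated string and reject unsupported names."""
    names = [n.strip() for n in value.split(",") if n.strip()]
    unknown = [n for n in names if n not in SUPPORTED_PATCH_MODELS]
    if unknown:
        raise ValueError(f"Unsupported patch models: {unknown}")
    return names


print(validate_patch_models("simple, yolox"))  # ['simple', 'yolox']
```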
Configuring preloaded models
- Variable: MARQO_MODELS_TO_PRELOAD
- Default value: '["hf/e5-base-v2", "open_clip/ViT-B-32/laion2b_s34b_b79k"]'
- Expected value: A JSON-encoded array of strings or objects.

This is a list of models to load and pre-warm as Marqo starts. This prevents a delay during initial search and index commands in actual Marqo usage.

Models in string form must be names of models within the model registry. Models in object form must have model and modelProperties keys.
Model Object Example (OPEN CLIP model)
'{
"model": "my-open-clip-1",
"modelProperties": {
"name": "ViT-B-32-quickgelu",
"dimensions": 512,
"url": "https://github.com/mlfoundations/open_clip/releases/download/v0.2-weights/vit_b_32-quickgelu-laion400m_avg-8a00ab3c.pt",
"type": "open_clip"
}
}'
Model Object Example (CLIP model)
'{
"model": "generic-clip-test-model-2",
"modelProperties": {
"name": "ViT-B/32",
"dimensions": 512,
"type": "clip",
"url": "https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt"
}
}'
Marqo Run Example (containing both string and object)
export MY_MODEL_LIST='[
"sentence-transformers/stsb-xlm-r-multilingual",
"hf/e5-base-v2",
{
"model": "generic-clip-test-model-2",
"modelProperties": {
"name": "ViT-B/32",
"dimensions": 512,
"type": "clip",
"url": "https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt"
}
}
]'
docker run --name marqo -p 8882:8882 \
-e MARQO_MODELS_TO_PRELOAD="$MY_MODEL_LIST" \
marqoai/marqo:latest
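Rather than hand-writing the JSON string, the variable's value can be built programmatically, which avoids quoting mistakes. A minimal sketch mixing a registry name (string) with a custom model object from the example above:

```python
import json

# Mixed list: a registry model name plus a custom CLIP model object.
models = [
    "hf/e5-base-v2",
    {
        "model": "generic-clip-test-model-2",
        "modelProperties": {
            "name": "ViT-B/32",
            "dimensions": 512,
            "type": "clip",
            "url": "https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt",
        },
    },
]

# This string is what gets passed as MARQO_MODELS_TO_PRELOAD.
value = json.dumps(models)

# Round-trip to confirm the value parses back cleanly.
roundtrip = json.loads(value)
print(roundtrip[0])  # hf/e5-base-v2
```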
Configuring log level
- Variable: MARQO_LOG_LEVEL
- Default value: 'info'
- Expected value: a str, one of 'error', 'warning', 'info', 'debug'.

This environment variable changes the log level of the timing logger and the uvicorn logger. A higher log level (e.g., 'error') reduces the amount of logging in Marqo, while a lower log level ('debug') records more detailed information in the logs. The default level is 'info'.
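The four accepted values map naturally onto standard logging severities, which makes the "higher level suppresses lower messages" behaviour easy to see. A small illustrative sketch (the helper is an assumption, not Marqo's internal code):

```python
import logging

# MARQO_LOG_LEVEL values mapped onto standard logging severities.
LEVELS = {
    "error": logging.ERROR,
    "warning": logging.WARNING,
    "info": logging.INFO,
    "debug": logging.DEBUG,
}


def is_emitted(message_level: str, configured_level: str) -> bool:
    """Would a message at message_level appear under configured_level?"""
    return LEVELS[message_level] >= LEVELS[configured_level]


print(is_emitted("info", "warning"))   # False: 'warning' hides info logs
print(is_emitted("error", "warning"))  # True: errors always get through
```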
Example
docker run --name marqo -p 8882:8882 \
-e MARQO_LOG_LEVEL='warning' \
marqoai/marqo:latest
Configuring throttling
Configuration name | Default | Description |
---|---|---|
MARQO_ENABLE_THROTTLING | "TRUE" | Enables throttling if "TRUE". Must be a str: either "TRUE" or "FALSE". |
MARQO_MAX_CONCURRENT_INDEX | 8 | Maximum allowed concurrent indexing threads |
MARQO_MAX_CONCURRENT_SEARCH | 8 | Maximum allowed concurrent search threads |
These environment variables set Marqo's allowed concurrency for indexing and search. If these limits are reached, Marqo will return a 429 error on subsequent requests. They should be set with respect to the available resources of the machine Marqo will be running on.
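Clients should expect occasional 429 responses under load and retry with backoff rather than failing outright. A hedged sketch of that pattern, where `send` stands in for any HTTP call to Marqo and is an assumption of this example:

```python
import time

def with_backoff(send, max_retries: int = 3, base_delay: float = 0.01):
    """Retry a request on 429 responses with exponential backoff."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            # Exponential backoff: base_delay, 2x, 4x, ...
            time.sleep(base_delay * (2 ** attempt))
    return status, body


# Stub that is throttled twice, then succeeds.
responses = iter([(429, None), (429, None), (200, "ok")])
status, body = with_backoff(lambda: next(responses))
print(status, body)  # 200 ok
```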
Example
docker run --name marqo -p 8882:8882 \
-e MARQO_ENABLE_THROTTLING='TRUE' \
-e MARQO_MAX_CONCURRENT_SEARCH='10' \
marqoai/marqo:latest
Marqo inference cache configuration
Configuration name | Default | Description |
---|---|---|
MARQO_INFERENCE_CACHE_SIZE | 0 (disabled) | The size (measured in the number of query-embedding pairs) of the Marqo inference cache. Set it to a positive integer to enable this feature. |
MARQO_INFERENCE_CACHE_TYPE | "LRU" (least recently used) | The eviction policy of the Marqo inference cache. Supported types are "LRU" and "LFU" (least frequently used). |
These environment variables configure the size and eviction policy of the Marqo inference cache,
which stores results from inference queries to improve search latency.
Note that this cache does not apply to the add_documents endpoint.
Consider enabling this feature if you frequently encounter a high volume of identical queries.
By default, this feature is disabled.
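To make the eviction policy concrete, here is a minimal LRU query-to-embedding cache like the one these settings describe. This is an illustrative sketch only; Marqo's internal implementation may differ.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal least-recently-used cache for query -> embedding pairs."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used


cache = LRUCache(capacity=2)
cache.put("query a", [0.1, 0.2])
cache.put("query b", [0.3, 0.4])
cache.get("query a")              # touching 'query a' makes 'query b' oldest
cache.put("query c", [0.5, 0.6])  # capacity exceeded: 'query b' is evicted
print(cache.get("query b"))       # None
```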
Example
docker run --name marqo -p 8882:8882 \
-e "MARQO_INFERENCE_CACHE_SIZE=20" \
-e "MARQO_INFERENCE_CACHE_TYPE=LRU" \
marqoai/marqo:latest
Other configurations
Configuration name | Default | Description |
---|---|---|
MARQO_EF_CONSTRUCTION_MAX_VALUE | 4096 | The maximum ef_construction value of Marqo indexes created by this Marqo instance. |
MARQO_MAX_SEARCHABLE_TENSOR_ATTRIBUTES | null | The maximum allowed number of tensor fields that can be searched in a single tensor search query. |
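As a sketch of how the ef_construction ceiling behaves, a requested value above the instance maximum is treated as invalid. The helper below is an illustration under that assumption, not Marqo's actual validation code:

```python
# Default instance maximum from the table above.
EF_CONSTRUCTION_MAX = 4096


def validate_ef_construction(requested: int,
                             maximum: int = EF_CONSTRUCTION_MAX) -> int:
    """Reject ef_construction values above the instance maximum."""
    if requested > maximum:
        raise ValueError(
            f"ef_construction={requested} exceeds the instance maximum {maximum}"
        )
    return requested


print(validate_ef_construction(512))  # 512 is within the default maximum
```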