Analysing Processing Time
Marqo provides a telemetry query parameter on the documents and search routes to help you better understand the processing time of your requests.
Getting Telemetry
If you are using the Python client, telemetry is enabled for all requests by passing return_telemetry=True when the client is instantiated.
import marqo
mq = marqo.Client(return_telemetry=True)
If you are using the REST API directly, you can enable telemetry by adding the telemetry=true query parameter to the request. This works for both the search and add documents routes.
curl -XPOST 'http://localhost:8882/indexes/my-first-index/documents?telemetry=true' \
-H 'Content-type:application/json' -d '
{
"documents": [
{
"Title": "The Travels of Marco Polo",
"Description": "A 13th-century travelogue describing the travels of Polo",
"Genre": "History"
},
{
"Title": "Extravehicular Mobility Unit (EMU)",
"Description": "The EMU is a spacesuit that provides environmental protection",
"_id": "article_591",
"Genre": "Science"
}
],
"tensorFields": ["Description"]
}'
curl -XPOST 'http://localhost:8882/indexes/my-first-index/search?telemetry=true' \
-H 'Content-type:application/json' -d '
{
"q": "what is the best outfit to wear on the moon?",
"limit": 10,
"offset": 0,
"showHighlights": true,
"searchMethod": "TENSOR",
"attributesToRetrieve": ["Title", "Description"]
}'
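The same requests can also be built from Python without the Marqo client. The sketch below is a minimal illustration using only the standard library, assuming the local endpoint and index name from the examples above. `build_telemetry_request` is a hypothetical helper, not part of Marqo. It only constructs the request; sending it requires a running Marqo instance.

```python
import json
import urllib.parse
import urllib.request


def build_telemetry_request(base_url, index_name, route, body):
    """Build a POST request for `route` ("search" or "documents") with telemetry enabled.

    Hypothetical helper for illustration; not part of the Marqo client.
    """
    # Encoding the parameter programmatically avoids misspelling it by hand.
    query = urllib.parse.urlencode({"telemetry": "true"})
    url = f"{base_url}/indexes/{urllib.parse.quote(index_name)}/{route}?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_telemetry_request(
    "http://localhost:8882",
    "my-first-index",
    "search",
    {"q": "what is the best outfit to wear on the moon?", "limit": 10},
)
print(request.full_url)

# To send it (requires a running Marqo instance):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the query string with `urlencode` rather than by hand guards against a misspelled parameter name in the URL.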
Understanding Telemetry
Search
For search requests, the response includes a new telemetry key with data as follows:
{
"timesMs": {
"search.vector_inference_full_pipeline": 11.1809599998905,
"search.vector.processing_before_vespa": 12.090376000131,
"search.vector.vespa": 17.501166999863926,
"search.vector.postprocess": 1.4460830000189162,
"POST /indexes/test-index/search": 84.4371719998087
}
}
search.vector_inference_full_pipeline
: The total time taken for the full pipeline of the vector search. This is the time it takes Marqo to run inference on your query.
search.vector.processing_before_vespa
: The time taken for processing before Vespa. This includes any steps around the inference.
search.vector.vespa
: The time taken for the Vespa database to perform the HNSW search.
search.vector.postprocess
: The time taken for post-processing the request.
POST /indexes/test-index/search
: The total time taken for the search request.
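To see where a search request spends its time, each stage can be compared against the total. The following is a minimal sketch, assuming the telemetry payload has the shape shown above and that the total is the entry whose key begins with the HTTP method. `stage_shares` is a hypothetical helper, not part of Marqo. Stages may overlap (for example, processing before Vespa may include inference), so the percentages need not sum to 100.

```python
def stage_shares(telemetry):
    """Return (stage, milliseconds, percent of total request time), slowest first."""
    times = telemetry["timesMs"]
    # Assumption: the total request time is keyed by the HTTP method and route.
    total_key = next(key for key in times if key.startswith("POST "))
    total_ms = times[total_key]
    stages = [
        (name, ms, 100.0 * ms / total_ms)
        for name, ms in times.items()
        if name != total_key
    ]
    return sorted(stages, key=lambda stage: stage[1], reverse=True)


# Sample payload copied from the search telemetry above.
telemetry = {
    "timesMs": {
        "search.vector_inference_full_pipeline": 11.1809599998905,
        "search.vector.processing_before_vespa": 12.090376000131,
        "search.vector.vespa": 17.501166999863926,
        "search.vector.postprocess": 1.4460830000189162,
        "POST /indexes/test-index/search": 84.4371719998087,
    }
}

for name, ms, percent in stage_shares(telemetry):
    print(f"{name:45s} {ms:8.2f} ms {percent:5.1f}%")
```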
Documents
For add documents requests, the response includes a new telemetry key with data as follows:
{
"timesMs": {
"image_download.imageserver.com/image1.jpg": 1185.17400199994,
"image_download.thread_time": [
1186.053542999844,
524.2454180001914
],
"image_download.imageserver.com/image2.jpg": 524.1457510001055,
"image_download.full_time": 1186.757126999964,
"add_documents.create_vectors": [
173.112292000058,
238.6752099999976
],
"add_documents.processing_before_vespa": 1425.432337999883,
"add_documents.vespa._bulk": 17.816249999896172,
"add_documents.postprocess": 0.007333999974434846,
"POST /indexes/test-index/documents": 1443.611254000052
}
}
image_download.imageserver.com/image1.jpg
: The time taken to download the image image1.jpg. (Images are downloaded in parallel.)
image_download.imageserver.com/image2.jpg
: The time taken to download the image image2.jpg. (Images are downloaded in parallel.)
image_download.thread_time
: The time taken to download the images in parallel.
image_download.full_time
: The total time taken to download the images.
add_documents.create_vectors
: The time taken to create the vectors for the documents. This covers pre-processing and inference with the model, as well as any weighted combinations.
add_documents.processing_before_vespa
: The time taken for processing before Vespa. This includes all image downloading and inference.
add_documents.vespa._bulk
: The time taken for Vespa to ingest the documents. This covers insertion of the data and vector indexing in the HNSW index.
add_documents.postprocess
: The time taken for post-processing the request.
POST /indexes/test-index/documents
: The total time taken for the add documents request.
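When indexing is slow, the per-image entries help identify which downloads dominate. The sketch below is a minimal illustration, assuming the payload shape shown above, where every `image_download.` key other than `thread_time` and `full_time` names a single image. `slowest_image_downloads` and `download_share` are hypothetical helpers, not part of Marqo.

```python
def slowest_image_downloads(telemetry, top_n=5):
    """Return up to `top_n` (image, milliseconds) pairs, slowest first."""
    times = telemetry["timesMs"]
    prefix = "image_download."
    # Assumption: these two keys are aggregates, not individual images.
    aggregate_keys = {"image_download.thread_time", "image_download.full_time"}
    downloads = [
        (name[len(prefix):], ms)
        for name, ms in times.items()
        if name.startswith(prefix) and name not in aggregate_keys
    ]
    return sorted(downloads, key=lambda item: item[1], reverse=True)[:top_n]


def download_share(telemetry):
    """Return the fraction of the total request time spent downloading images."""
    times = telemetry["timesMs"]
    total_key = next(key for key in times if key.startswith("POST "))
    return times["image_download.full_time"] / times[total_key]


# Sample payload copied from the add documents telemetry above.
telemetry = {
    "timesMs": {
        "image_download.imageserver.com/image1.jpg": 1185.17400199994,
        "image_download.thread_time": [1186.053542999844, 524.2454180001914],
        "image_download.imageserver.com/image2.jpg": 524.1457510001055,
        "image_download.full_time": 1186.757126999964,
        "add_documents.create_vectors": [173.112292000058, 238.6752099999976],
        "add_documents.processing_before_vespa": 1425.432337999883,
        "add_documents.vespa._bulk": 17.816249999896172,
        "add_documents.postprocess": 0.007333999974434846,
        "POST /indexes/test-index/documents": 1443.611254000052,
    }
}

for image, ms in slowest_image_downloads(telemetry):
    print(f"{image:35s} {ms:8.2f} ms")
print(f"download share of request: {download_share(telemetry):.1%}")
```

In this sample, image downloading accounts for most of the request time, which suggests looking at image hosting or image size before tuning inference or Vespa.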