Get one document

Gets a document using its ID.


GET /indexes/{index_name}/documents/{document_id}

Path parameters

Name         Type    Description
index_name   String  Name of the index
document_id  String  ID of the document

Query parameters

Query parameter  Type     Default value  Description
expose_facets    Boolean  False          If true, the document's tensor facets are returned as a list of objects. Each facet object contains document data and its associated embedding (in the facet's _embedding field).
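
As a rough illustration of how the path and query parameters combine, the sketch below assembles the request URL with the Python standard library only. The helper name get_document_url is hypothetical and not part of any Marqo client, and percent-encoding the path segments is an assumption about how IDs with special characters should be passed; the source does not state it.

```python
from urllib.parse import quote, urlencode


def get_document_url(base_url, index_name, document_id, expose_facets=False):
    """Build the GET URL for one document (hypothetical helper, not a Marqo API)."""
    # Assumption: path segments are percent-encoded so that IDs containing
    # spaces or slashes stay inside a single path segment.
    path = f"/indexes/{quote(index_name, safe='')}/documents/{quote(document_id, safe='')}"
    url = base_url.rstrip("/") + path
    if expose_facets:
        # expose_facets defaults to false, so it is only sent when requested.
        url += "?" + urlencode({"expose_facets": "true"})
    return url


print(get_document_url("http://localhost:8882", "my-first-index", "article_591", expose_facets=True))
```

The printed URL can be passed to curl or to any HTTP client.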

Example

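cURL

curl -XGET 'http://localhost:8882/indexes/my-first-index/documents/article_591?expose_facets=true'

Python

import marqo

# Connect to the local Marqo instance used in the cURL example
mq = marqo.Client(url="http://localhost:8882")

mq.index("my-first-index").get_document(
    document_id="article_591",
    expose_facets=True
)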

For Marqo Cloud, replace your_endpoint with the endpoint of your index; to find it, visit Find Your Endpoint. You will also need your API key; to obtain it, visit Find Your API Key.

cURL

curl -XGET 'your_endpoint/indexes/my-first-index/documents/article_591?expose_facets=true' \
-H 'x-api-key: XXXXXXXXXXXXXXX'

Python

mq.index("my-first-index").get_document(
    document_id="article_591",
    expose_facets=True
)

Response: 200 OK

{'Blurb': 'A rocket car is a car powered by a rocket engine. This treatise '
          'proposes that rocket cars are the inevitable future of land-based '
          'transport.',
 'Title': 'Treatise on the viability of rocket cars',
 '_id': 'article_591',
 '_tensor_facets': [{'Title': 'Treatise on the viability of rocket cars',
                     '_embedding': [-0.10393160581588745,
                                    0.0465407557785511,
                                    -0.01760256476700306,
                                    ...]},
                    {'Blurb': 'A rocket car is a car powered by a rocket '
                              'engine. This treatise proposes that rocket cars '
                              'are the inevitable future of land-based '
                              'transport.',
                     '_embedding': [-0.045681700110435486,
                                    0.056278493255376816,
                                    0.022254955023527145,
                                    ...]}]
}

In this example, the GET document request was sent with the expose_facets parameter set to true, so the response includes the _tensor_facets field. Each facet contains a key-value pair holding the facet's content and an _embedding field holding that content's vector representation.
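
To make the facet structure concrete, here is a minimal sketch that groups embeddings by field name. It works on a plain dictionary shaped like the response above; the helper name embeddings_by_field is hypothetical, and the three-element vectors are shortened stand-ins for real embeddings.

```python
def embeddings_by_field(document):
    """Map each field name to the list of embeddings found in _tensor_facets."""
    result = {}
    for facet in document.get("_tensor_facets", []):
        embedding = facet["_embedding"]
        for key in facet:
            # Every key other than _embedding names the field the facet came from.
            if key != "_embedding":
                result.setdefault(key, []).append(embedding)
    return result


# Shortened stand-in for a response; real embeddings are much longer.
document = {
    "Title": "Treatise on the viability of rocket cars",
    "_tensor_facets": [
        {"Title": "Treatise on the viability of rocket cars",
         "_embedding": [-0.10, 0.05, -0.02]},
        {"Blurb": "A rocket car is a car powered by a rocket engine.",
         "_embedding": [-0.05, 0.06, 0.02]},
    ],
}

vectors = embeddings_by_field(document)
print(sorted(vectors))        # field names that have at least one facet
print(len(vectors["Title"]))  # number of facets for the Title field
```

A field maps to a list because a single field can produce more than one facet, for example when its content is split into chunks.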

Get custom vector document

The custom_vector document field is special: its content is not stored in the same structure in which it was input. Instead, the content subfield of the dictionary is stored as the value of the whole field (for use in lexical search, filtering, and highlights). When the expose_facets parameter is set to true, the GET document request returns the vector subfield as the facet's _embedding field.

Example

# Placeholder vector for example purposes. Replace it with your own.
example_vector_1 = [i for i in range(512)]

# Add custom vector document
mq.index("my-first-index").add_documents(
    [
        {
            "_id": "doc_0",
            "my_custom_vector": {
                "vector": example_vector_1,  # Put your custom generated vector here
                "content": "Your content goes here!",
            },
        }
    ],
    tensor_fields=["my_custom_vector"],
    mappings={"my_custom_vector": {"type": "custom_vector"}},
)

# Get the custom vector document
mq.index("my-first-index").get_document(document_id="doc_0", expose_facets=True)

Response: 200 OK

{
 '_id': 'doc_0',
 'my_custom_vector': 'Your content goes here!',
 '_tensor_facets': [{'my_custom_vector': 'Your content goes here!',
                     '_embedding': [0, 1, 2, ..., 510, 511]}]
}
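
Note that the content and vector subfields can no longer be fetched under those names, because the field has been restructured: the content is now the value of the field itself, and the vector is available only as _embedding within the facet.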
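
If the original input shape is needed again, it can be rebuilt on the client side from the response. The sketch below assumes a response dictionary shaped like the one above; rebuild_custom_vector is a hypothetical helper, not part of the Marqo client.

```python
def rebuild_custom_vector(document, field_name):
    """Return {"content": ..., "vector": ...} for a custom_vector field, or None."""
    for facet in document.get("_tensor_facets", []):
        if field_name in facet:
            # The facet stores the content under the field name
            # and the vector under _embedding.
            return {"content": facet[field_name], "vector": facet["_embedding"]}
    return None


# Stand-in for the response shown above.
document = {
    "_id": "doc_0",
    "my_custom_vector": "Your content goes here!",
    "_tensor_facets": [
        {"my_custom_vector": "Your content goes here!",
         "_embedding": list(range(512))}
    ],
}

restored = rebuild_custom_vector(document, "my_custom_vector")
print(restored["content"])
print(len(restored["vector"]))
```

The helper returns None when the document was fetched without expose_facets, since no facet is present in that case.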

Example: Getting a multimodal document with video or audio

cURL

curl -XGET 'http://localhost:8882/indexes/my-first-index/documents/multimodal_document?expose_facets=true'

Python

mq.index("my-first-index").get_document(
    document_id="multimodal_document",
    expose_facets=True
)

Response: 200 OK

{
    'video_field': 'https://example.com/video.mp4', 
    '_id': 'multimodal_document', 
    '_tensor_facets': [
        {'video_field': 'start_0.00::end_10.00', 
        '_embedding': [
            -0.9606754779815674, 
            -0.17251494526863098, 
            0.5819228291511536, 
            ...]},
        {'video_field': 'start_6.56::end_16.56', 
        '_embedding': [
            1.3152823448181152, 
            0.7909719347953796, 
            -1.2539052963256836, 
            ...]},
        ]
}

In this example, we have a multimodal document with one video field of length 16.56 seconds. The videoPreprocessing setting was set to splitLength=10 and splitOverlap=0. The video was therefore split into 2 chunks. As the response shows, these cover 0-10 seconds and 6.56-16.56 seconds: the final chunk keeps the full 10-second length and ends at the end of the video, so it overlaps the first chunk. The _tensor_facets field is returned in the result, with each chunk's start and end times and its vector representation.
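
The chunk labels can be turned back into numeric time spans with a small parser. This is a sketch that assumes the start_<seconds>::end_<seconds> format seen in the response above; the format is inferred from that example rather than from a specification, and parse_chunk_label is a hypothetical helper.

```python
import re

# Assumed label format, inferred from the example response: start_0.00::end_10.00
CHUNK_PATTERN = re.compile(r"^start_(\d+(?:\.\d+)?)::end_(\d+(?:\.\d+)?)$")


def parse_chunk_label(label):
    """Return (start_seconds, end_seconds) parsed from a chunk label."""
    match = CHUNK_PATTERN.match(label)
    if match is None:
        raise ValueError(f"Unrecognised chunk label: {label!r}")
    return float(match.group(1)), float(match.group(2))


# Labels copied from the response above.
labels = ["start_0.00::end_10.00", "start_6.56::end_16.56"]
spans = [parse_chunk_label(label) for label in labels]
print(spans)  # [(0.0, 10.0), (6.56, 16.56)]
```

Raising on an unrecognised label, rather than returning None, makes a change in the label format visible immediately.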