
Text Only Indexes

For documentation on structured indexes see here.

This section shows how to create a simple structured index with a single text field to vectorise. This is the simplest form of a structured index. We will walk through each part of the settings object and explain what it does.

The bare minimum settings object for a structured index with a text field that supports tensor and lexical search is as follows:

import marqo

settings = {
    "type": "structured",
    "allFields": [
        {"name": "text_field", "type": "text", "features": ["lexical_search"]},
    ],
    "tensorFields": ["text_field"],
}

mq = marqo.Client(url="http://localhost:8882")

mq.create_index("my-simple-structured-index", settings_dict=settings)

Index Settings: type

The type field is required and must be set to structured for a structured index.

Index Settings: allFields

The allFields field defines the schema for the index. Each field in the schema must have a name and a type, and can optionally have a features field as well.

The name defines the name of the field in the document. The type defines the datatype of the field. The features field is an array of features that can be applied to the field. In this case we are using the lexical_search feature which allows us to search with the LEXICAL search method.

Index Settings: tensorFields

The tensorFields field is an array of fields that will be vectorised. In this case we are vectorising the text_field field. These fields are available for the TENSOR search method.
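The relationship between allFields and tensorFields can be checked locally before the index is created. The following is a minimal sketch, not part of the Marqo client: the helper name undeclared_tensor_fields is hypothetical, and it assumes that every entry in tensorFields is expected to match the name of a field declared in allFields.

```python
def undeclared_tensor_fields(settings: dict) -> list[str]:
    """Return tensorFields entries with no matching name in allFields.

    Hypothetical helper for illustration; it is not a Marqo API.
    """
    declared = {field["name"] for field in settings.get("allFields", [])}
    return [
        name for name in settings.get("tensorFields", []) if name not in declared
    ]


settings = {
    "type": "structured",
    "allFields": [
        {"name": "text_field", "type": "text", "features": ["lexical_search"]},
    ],
    "tensorFields": ["text_field"],
}

# An empty list means every tensor field is declared in the schema.
missing = undeclared_tensor_fields(settings)
```

A check like this catches a misspelled field name in tensorFields without needing a running Marqo instance.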

Example Add Documents Usage

Because we are using a structured index, we don't need to specify the tensor fields when adding documents. The schema takes care of this for us.

documents = [
    {
        "_id": "1",
        "text_field": "New York",
    },
    {
        "_id": "2",
        "text_field": "Los Angeles",
    },
]

mq.index("my-simple-structured-index").add_documents(documents)
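Since the schema decides which fields a document may contain, it can be useful to compare documents against it before sending them. The following is a rough sketch under stated assumptions: the helper unknown_fields is hypothetical, and it assumes that a structured index only accepts fields declared in allFields, plus the reserved _id key.

```python
def unknown_fields(document: dict, settings: dict) -> set[str]:
    """Return document keys that are neither declared in allFields nor "_id".

    Hypothetical helper for illustration; it is not a Marqo API.
    """
    declared = {field["name"] for field in settings["allFields"]}
    return set(document) - declared - {"_id"}


settings = {
    "type": "structured",
    "allFields": [
        {"name": "text_field", "type": "text", "features": ["lexical_search"]},
    ],
    "tensorFields": ["text_field"],
}

documents = [
    {"_id": "1", "text_field": "New York"},
    {"_id": "2", "text_field": "Los Angeles"},
]

# Map each document id to any fields the schema does not declare.
problems = {
    doc["_id"]: unknown_fields(doc, settings)
    for doc in documents
    if unknown_fields(doc, settings)
}
```

Here problems is empty because both documents use only text_field. A document with an extra key, such as a hypothetical "title", would be listed under its _id.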

Example Search Usage

The schema above will allow us to search the text_field field using the LEXICAL search method and the TENSOR search method.

Tensor Search (search_method="TENSOR" is the default):

results = mq.index("my-simple-structured-index").search(q="New York")

Lexical Search:

results = mq.index("my-simple-structured-index").search(
    q="New York", search_method="LEXICAL"
)
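Both calls return a response that can be inspected in the same way. The sketch below assumes that the response is a dictionary with a "hits" list whose entries carry the document's _id and a _score. The helper summarise_hits is hypothetical, and sample_results is made-up data with illustrative scores, not output from a real search.

```python
def summarise_hits(results: dict) -> list[tuple[str, float]]:
    """Return (_id, _score) pairs from a search response, in ranked order.

    Hypothetical helper for illustration; it is not a Marqo API.
    """
    return [(hit["_id"], hit["_score"]) for hit in results.get("hits", [])]


# Made-up response shaped like the assumed format; scores are illustrative.
sample_results = {
    "hits": [
        {"_id": "1", "text_field": "New York", "_score": 0.93},
        {"_id": "2", "text_field": "Los Angeles", "_score": 0.61},
    ],
}

summary = summarise_hits(sample_results)
```

With a running Marqo instance, the results variable from either search above could be passed to summarise_hits in place of sample_results.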