Mappings

The mappings object is passed as the mappings parameter of an add_documents call. It gives granular control over individual fields and is currently supported for the multimodal_combination, custom_vector, and text_field field types.

When creating a structured index, you define the weights for a multimodal field under dependent fields. Mappings are therefore optional when adding documents to a structured index; supply them only to override the default multimodal weights defined at index creation time.
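
As a minimal sketch of such an override, the helper below builds a per-request mappings object for one multimodal field and passes it to add_documents. The helper names (build_weight_override, add_with_weight_override), the field names, and the weights are illustrative assumptions, and index is assumed to be the object returned by mq.index(...).

```python
from typing import Any, Dict, List


def build_weight_override(field_name: str, weights: Dict[str, float]) -> Dict[str, Any]:
    """Build a mappings object that overrides the weights of one multimodal field."""
    return {field_name: {"type": "multimodal_combination", "weights": dict(weights)}}


def add_with_weight_override(index: Any, documents: List[dict],
                             field_name: str, weights: Dict[str, float]) -> Any:
    """Add documents to a structured index with request-level weights.

    `index` is assumed to be the object returned by `mq.index("...")`.
    No tensor_fields argument is passed here because a structured index
    declares its tensor fields at index creation time.
    """
    mappings = build_weight_override(field_name, weights)
    return index.add_documents(documents=documents, mappings=mappings)


# Hypothetical field names and weights, for illustration only.
override = build_weight_override("my_combination_field", {"My_image": 0.8, "Some_text": 0.2})
print(override)
```

Documents added without mappings keep the weights defined at index creation time.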

Mappings is used to define custom_vector fields for unstructured indexes only. For structured indexes, do not include custom_vector fields in mappings. Instead, declare them as fields during index creation.

Language mappings are only supported for unstructured indexes created with Marqo 2.16 or later.


Mappings object

Multimodal Combination Mappings

Defining the mapping for multimodal_combination fields:

my_mappings = {
    "my_combination_field": {
        "type": "multimodal_combination",
        "weights": {"My_image": 0.5, "Some_text": 0.5},
    },
    "my_2nd_combination_field": {
        "type": "multimodal_combination",
        "weights": {"Title": -2.5, "Description": 0.3},
    },
}
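
To build intuition for what the weights do, the sketch below combines per-field vectors as a weighted sum. This is an illustration only: it assumes a plain weighted sum with no normalization, which may differ from how Marqo combines vectors internally, and the 3-dimensional vectors are made up for the example.

```python
from typing import Dict, List


def combine_vectors(vectors: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Weighted sum of per-field vectors (illustrative, not Marqo's implementation)."""
    dims = {len(v) for v in vectors.values()}
    if len(dims) != 1:
        raise ValueError("all field vectors must have the same length")
    combined = [0.0] * dims.pop()
    for field, vector in vectors.items():
        weight = weights[field]
        for i, value in enumerate(vector):
            combined[i] += weight * value
    return combined


# Made-up vectors standing in for the embeddings of each sub-field.
field_vectors = {"Title": [1.0, 0.0, 0.0], "Description": [0.0, 1.0, 0.0]}
combined = combine_vectors(field_vectors, {"Title": -2.5, "Description": 0.3})
print(combined)  # [-2.5, 0.3, 0.0]
```

Under this assumption, a negative weight such as -2.5 pushes the combined vector away from that sub-field's direction.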

Custom Vector Mappings

Defining the mapping for custom_vector fields (in an unstructured index):

my_mappings = {
    "my_custom_audio_vector_1": {"type": "custom_vector"},
    "my_custom_audio_vector_2": {"type": "custom_vector"},
}

Adding custom vector documents using that mapping object:

Unstructured Index

import marqo

# Example vectors for illustration only. Replace these with your own.
example_vector_1 = [i for i in range(512)]
example_vector_2 = [1 / (i + 1) for i in range(512)]

# Create the unstructured index
mq = marqo.Client("http://localhost:8882", api_key=None)
settings = {
    "type": "unstructured",
    "model": "open_clip/ViT-B-32/laion2b_s34b_b79k",
}
mq.create_index("my-custom-vector-index", settings_dict=settings)

# Add the custom vectors
mq.index("my-custom-vector-index").add_documents(
    documents=[
        {
            "_id": "doc1",
            "my_custom_audio_vector_1": {
                # Put your own vector (of correct length) here.
                "vector": example_vector_1,
                "content": "Singing audio file",
            },
        },
        {
            "_id": "doc2",
            "my_custom_audio_vector_2": {
                # Put your own vector (of correct length) here.
                "vector": example_vector_2,
                "content": "Podcast audio file",
            },
        },
    ],
    tensor_fields=["my_custom_audio_vector_1", "my_custom_audio_vector_2"],
    mappings=my_mappings,
)

For Marqo Cloud, look up the endpoint of your index (see Find Your Endpoint) and use it in place of your_endpoint. You will also need your API key (see Find Your API Key).

import marqo

# Example vectors for illustration only. Replace these with your own.
example_vector_1 = [i for i in range(512)]
example_vector_2 = [1 / (i + 1) for i in range(512)]

# Create the unstructured index
mq = marqo.Client("https://api.marqo.ai", api_key="XXXXXXXXXXXXXXX")
settings = {
    "type": "unstructured",
    "model": "open_clip/ViT-B-32/laion2b_s34b_b79k",
}
mq.create_index("my-custom-vector-index", settings_dict=settings)

# Add the custom vectors
mq.index("my-custom-vector-index").add_documents(
    documents=[
        {
            "_id": "doc1",
            "my_custom_audio_vector_1": {
                # Put your own vector (of correct length) here.
                "vector": example_vector_1,
                "content": "Singing audio file",
            },
        },
        {
            "_id": "doc2",
            "my_custom_audio_vector_2": {
                # Put your own vector (of correct length) here.
                "vector": example_vector_2,
                "content": "Podcast audio file",
            },
        },
    ],
    tensor_fields=["my_custom_audio_vector_1", "my_custom_audio_vector_2"],
    mappings=my_mappings,
)

Structured Index

import marqo

# Example vectors for illustration only. Replace these with your own.
example_vector_1 = [i for i in range(512)]
example_vector_2 = [1 / (i + 1) for i in range(512)]

# Create the structured index
mq = marqo.Client("http://localhost:8882", api_key=None)
settings = {
    "type": "structured",
    "model": "open_clip/ViT-B-32/laion2b_s34b_b79k",
    "allFields": [
        {"name": "my_custom_audio_vector_1", "type": "custom_vector"},
        {"name": "my_custom_audio_vector_2", "type": "custom_vector"},
    ],
    "tensorFields": ["my_custom_audio_vector_1", "my_custom_audio_vector_2"],
}
mq.create_index("my-structured-custom-vector-index", settings_dict=settings)

# Add the custom vectors
mq.index("my-structured-custom-vector-index").add_documents(
    documents=[
        {
            "_id": "doc1",
            "my_custom_audio_vector_1": {
                # Put your own vector (of correct length) here.
                "vector": example_vector_1,
                "content": "Singing audio file",
            },
        },
        {
            "_id": "doc2",
            "my_custom_audio_vector_2": {
                # Put your own vector (of correct length) here.
                "vector": example_vector_2,
                "content": "Podcast audio file",
            },
        },
    ]
)

For Marqo Cloud, look up the endpoint of your index (see Find Your Endpoint) and use it in place of your_endpoint. You will also need your API key (see Find Your API Key).

import marqo

# Example vectors for illustration only. Replace these with your own.
example_vector_1 = [i for i in range(512)]
example_vector_2 = [1 / (i + 1) for i in range(512)]

# Create the structured index
mq = marqo.Client("https://api.marqo.ai", api_key="XXXXXXXXXXXXXXX")
settings = {
    "type": "structured",
    "model": "open_clip/ViT-B-32/laion2b_s34b_b79k",
    "allFields": [
        {"name": "my_custom_audio_vector_1", "type": "custom_vector"},
        {"name": "my_custom_audio_vector_2", "type": "custom_vector"},
    ],
    "tensorFields": ["my_custom_audio_vector_1", "my_custom_audio_vector_2"],
}
mq.create_index("my-structured-custom-vector-index", settings_dict=settings)

# Add the custom vectors
mq.index("my-structured-custom-vector-index").add_documents(
    documents=[
        {
            "_id": "doc1",
            "my_custom_audio_vector_1": {
                # Put your own vector (of correct length) here.
                "vector": example_vector_1,
                "content": "Singing audio file",
            },
        },
        {
            "_id": "doc2",
            "my_custom_audio_vector_2": {
                # Put your own vector (of correct length) here.
                "vector": example_vector_2,
                "content": "Podcast audio file",
            },
        },
    ]
)
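
The comments in the examples above ask for vectors of the correct length. A client-side check that could run before add_documents is sketched below. The helper name check_custom_vectors is hypothetical, and the expected dimension of 512 is an assumption chosen to match the 512-element example vectors.

```python
from typing import List


def check_custom_vectors(documents: List[dict], custom_vector_fields: List[str],
                         expected_dim: int) -> List[str]:
    """Return a list of problems found in custom vector fields; empty means OK."""
    problems = []
    for doc in documents:
        doc_id = doc.get("_id", "<no _id>")
        for field in custom_vector_fields:
            if field not in doc:
                continue  # a document need not contain every custom vector field
            value = doc[field]
            vector = value.get("vector") if isinstance(value, dict) else None
            if vector is None:
                problems.append(f"{doc_id}.{field}: missing 'vector'")
            elif len(vector) != expected_dim:
                problems.append(
                    f"{doc_id}.{field}: length {len(vector)}, expected {expected_dim}"
                )
    return problems


docs = [
    {"_id": "doc1", "my_custom_audio_vector_1": {"vector": [0.0] * 512, "content": "ok"}},
    {"_id": "doc2", "my_custom_audio_vector_2": {"vector": [0.0] * 10, "content": "too short"}},
]
fields = ["my_custom_audio_vector_1", "my_custom_audio_vector_2"]
print(check_custom_vectors(docs, fields, expected_dim=512))
```

Only doc2 is reported here, because its vector has 10 elements instead of 512.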

Text Field Language Mappings

For unstructured indexes created with Marqo 2.16 or later, you can specify a language for text fields to control lexical search behavior:

my_mappings = {
    "title": {"type": "text_field", "language": "fr"},
    "description": {"type": "text_field", "language": "es"},
    "content": {"type": "text_field", "language": "en"},
}

Language mappings affect how text is processed for lexical search operations. The specified language is used for:

  • Tokenization
  • Stemming
  • Stop word removal
  • Other language-specific text processing

Notes

  • A field's language is set the first time it is indexed. Attempting to index the same field with a different language will result in a 400 error.
  • If no language is specified for a new field, that field will use automatic language detection.
  • Omitting the language in subsequent mappings uses the previously set language.
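
The rules in these notes can be modelled client-side to catch a conflicting request before it is sent. The sketch below is only a local model of the notes, not Marqo's implementation; the names apply_language_mappings and LanguageConflict are hypothetical, and fields left to automatic detection are not tracked.

```python
from typing import Dict


class LanguageConflict(ValueError):
    """Signals a request that, per the notes above, would be rejected with a 400."""


def apply_language_mappings(known: Dict[str, str], mappings: Dict[str, dict]) -> Dict[str, str]:
    """Return the field -> language table after applying `mappings`.

    `known` holds languages that were set explicitly in earlier requests.
    """
    updated = dict(known)
    for field, mapping in mappings.items():
        if mapping.get("type") != "text_field":
            continue
        requested = mapping.get("language")
        if requested is None:
            continue  # omitted: the previously set language (if any) stays in effect
        previous = updated.get(field)
        if previous is not None and previous != requested:
            raise LanguageConflict(
                f"field '{field}' is already '{previous}', got '{requested}'"
            )
        updated[field] = requested
    return updated


languages = apply_language_mappings({}, {"title": {"type": "text_field", "language": "fr"}})
languages = apply_language_mappings(languages, {"title": {"type": "text_field"}})
print(languages)  # {'title': 'fr'}
```

Passing {"title": {"type": "text_field", "language": "es"}} afterwards raises LanguageConflict in this model.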

Supported Languages

The following language codes are supported:

  • Arabic (ar)
  • Catalan (ca)
  • Danish (da)
  • Dutch (nl)
  • English (en)
  • Finnish (fi)
  • French (fr)
  • German (de)
  • Greek (el)
  • Hungarian (hu)
  • Indonesian (id)
  • Irish (ga)
  • Italian (it)
  • Norwegian (nb)
  • Portuguese (pt)
  • Romanian (ro)
  • Russian (ru)
  • Spanish (es)
  • Swedish (sv)
  • Turkish (tr)
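
A mappings object can be checked against this list before it is sent. The following is a small sketch; the constant and function names are hypothetical, and the set simply copies the codes listed above.

```python
from typing import Dict

# Language codes copied from the list above.
SUPPORTED_LANGUAGE_CODES = frozenset({
    "ar", "ca", "da", "nl", "en", "fi", "fr", "de", "el", "hu",
    "id", "ga", "it", "nb", "pt", "ro", "ru", "es", "sv", "tr",
})


def find_unsupported_languages(mappings: Dict[str, dict]) -> Dict[str, str]:
    """Return {field: code} for text_field mappings whose language is not listed."""
    unsupported = {}
    for field, mapping in mappings.items():
        if mapping.get("type") != "text_field":
            continue
        code = mapping.get("language")
        if code is not None and code not in SUPPORTED_LANGUAGE_CODES:
            unsupported[field] = code
    return unsupported


candidate = {
    "title": {"type": "text_field", "language": "fr"},
    "body": {"type": "text_field", "language": "ja"},
}
print(find_unsupported_languages(candidate))  # {'body': 'ja'}
```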

Example: Adding Multilingual Documents

import marqo

mq = marqo.Client("http://localhost:8882", api_key=None)

# Create unstructured index
mq.create_index(
    index_name="multilingual-index",
    type="unstructured",
    model="hf/e5-base-v2"
)

# Define language mappings
language_mappings = {
    "title_en": {"type": "text_field", "language": "en"},
    "title_fr": {"type": "text_field", "language": "fr"},
    "title_es": {"type": "text_field", "language": "es"},
}

# Add documents with language mappings
mq.index("multilingual-index").add_documents(
    documents=[
        {
            "_id": "doc1",
            "title_en": "Brown shoes for men",
            "title_fr": "Chaussures marron pour hommes",
            "title_es": "Zapatos marrones para hombres"
        }
    ],
    tensor_fields=["title_en"],
    mappings=language_mappings
)

For Marqo Cloud, use the cloud URL and your API key when creating the client:

import marqo

mq = marqo.Client("https://api.marqo.ai", api_key="XXXXXXXXXXXXXXX")

# Create unstructured index
mq.create_index(
    index_name="multilingual-index",
    type="unstructured",
    model="hf/e5-base-v2"
)

# Define language mappings
language_mappings = {
    "title_en": {"type": "text_field", "language": "en"},
    "title_fr": {"type": "text_field", "language": "fr"},
    "title_es": {"type": "text_field", "language": "es"},
}

# Add documents with language mappings
mq.index("multilingual-index").add_documents(
    documents=[
        {
            "_id": "doc1",
            "title_en": "Brown shoes for men",
            "title_fr": "Chaussures marron pour hommes",
            "title_es": "Zapatos marrones para hombres"
        }
    ],
    tensor_fields=["title_en"],
    mappings=language_mappings
)

Example: Searching with Language Mappings

Once you've indexed documents with language mappings, you can search them using the language parameter:

# Search in French
results = mq.index("multilingual-index").search(
    q="chaussures marron",
    search_method="LEXICAL",
    language="fr",
    searchable_attributes=["title_fr"]
)

# Hybrid search combining tensor and language-specific lexical search
results = mq.index("multilingual-index").search(
    q="brown shoes",
    search_method="HYBRID",
    language="en",
    hybrid_parameters={
        "retrievalMethod": "disjunction",
        "rankingMethod": "rrf"
    }
)
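
When the language mappings are kept in code, the searchable attributes for a lexical query can be derived from them rather than typed by hand. The helper below is a sketch under that assumption; lexical_search_kwargs is a hypothetical name, and it only assembles the keyword arguments used in the lexical example above.

```python
from typing import Any, Dict


def lexical_search_kwargs(query: str, language: str,
                          mappings: Dict[str, dict]) -> Dict[str, Any]:
    """Build search() keyword arguments for a language-specific lexical query."""
    fields = [
        field
        for field, mapping in mappings.items()
        if mapping.get("type") == "text_field" and mapping.get("language") == language
    ]
    if not fields:
        raise ValueError(f"no text_field is mapped to language '{language}'")
    return {
        "q": query,
        "search_method": "LEXICAL",
        "language": language,
        "searchable_attributes": fields,
    }


language_mappings = {
    "title_en": {"type": "text_field", "language": "en"},
    "title_fr": {"type": "text_field", "language": "fr"},
    "title_es": {"type": "text_field", "language": "es"},
}
kwargs = lexical_search_kwargs("chaussures marron", "fr", language_mappings)
print(kwargs["searchable_attributes"])  # ['title_fr']
```

The result can then be passed on as mq.index("multilingual-index").search(**kwargs).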