Bring your own models

If the models in our registry do not meet your requirements, or you have a custom model that you want to use, you can bring your own model to Marqo. In this section, we show you how to use your own OpenCLIP models and Hugging Face Sentence Transformers models in Marqo.


Bring your own OpenCLIP model

Marqo supports loading your own OpenCLIP models fine-tuned under the OpenCLIP framework. To load a custom OpenCLIP model, you need to provide the model properties in the index settings. The full set of fields available in modelProperties is listed below:

| Field Name | Type | Default Value | Description |
|---|---|---|---|
| name | String | No Default | The name of the model. It can be the model architecture (e.g., "ViT-B-32") or a Hugging Face model card name starting with "hf-hub:". |
| dimensions | Integer | No Default | The dimension of the embeddings generated by the model. |
| type | String | No Default | The type of the model. It must be "open_clip" since we are loading an OpenCLIP model here. |
| url | String (Optional) | None | The URL of the model checkpoint. Cannot be provided together with modelLocation. |
| modelLocation | Dict (Optional) | None | The location of the model in S3 or Hugging Face. Cannot be provided together with url. |
| jit | Boolean (Optional) | False | Whether the model checkpoint is JIT compiled. |
| precision | String (Optional) | "fp32" | The precision of the model. It must be either "fp32" or "fp16". |
| tokenizer | String (Optional) | None | The Hugging Face tokenizer to load. Provide this if you want to override the tokenizer inferred from name. |
| imagePreprocessor | String (Optional) | "OpenCLIP" | The image preprocessing configuration. Must be one of "SigLIP", "OpenAI", "OpenCLIP", "MobileCLIP", or "CLIPA". |
| mean | List[float] (Optional) | None | The mean of the image preprocessor. If provided, it overrides the loaded configuration. |
| std | List[float] (Optional) | None | The standard deviation of the image preprocessor. If provided, it overrides the loaded configuration. |
| size | Integer (Optional) | None | The image size of the image preprocessor. If provided, it overrides the loaded configuration. |
| note | String (Optional) | None | A place to add notes to your model. This does not affect the model loading process. |
| pretrained | String (Optional) | None | A place to indicate the pretrained dataset of your model. This does not affect the model loading process. |

Most of the fields are optional and have default values. You can provide the fields you want to customize in the modelProperties. However, you need to provide at least the name, dimensions, and type fields to load a custom OpenCLIP model. There are two ways to load a custom OpenCLIP model in Marqo:

Load from a Hugging Face model card

To load a custom OpenCLIP model from a Hugging Face model card, you need to provide the model card name with the "hf-hub:" prefix in name, the dimensions of the model in dimensions, and "open_clip" as the type. Other fields are ignored in this case. This suits the case where you want to load a public model card from Hugging Face.

For example, instead of loading the Marqo FashionCLIP model from the registry, you can load it from Hugging Face with the following code:

settings = {
    "treatUrlsAndPointersAsImages": True,
    "model": "marqo-fashion-clip-custom-load",
    "modelProperties": {
        "name": "hf-hub:Marqo/marqo-fashionCLIP",
        "dimensions": 512,
        "type": "open_clip",
    },
    "normalizeEmbeddings": True,
}

response = mq.create_index(
    "marqo-fashion-clip-custom-load-index", settings_dict=settings
)

Load from a checkpoint file

This is the case where you have a custom OpenCLIP model checkpoint file that you want to load in Marqo. This is the most flexible option: you can load any custom model you have fine-tuned, from any source, with any configuration, as long as the architecture is supported by OpenCLIP.

You need to provide the model name in name which is the architecture of the model (e.g., "ViT-B-32", "ViT-L-16-SigLIP"), the dimensions of the model in dimensions, and the type of the model in type as "open_clip".

You have two options to provide the checkpoint file:

1. Provide the URL of the checkpoint file in url. The URL must be accessible by Marqo and must link to a checkpoint file in *.pt format.
2. Provide the location of the checkpoint file in S3 or Hugging Face in modelLocation. modelLocation has the following fields:

| Field Name | Type | Default Value | Description |
|---|---|---|---|
| s3 | Dict | No Default | A dictionary with "Bucket" and "Key" fields locating the *.pt checkpoint. |
| hf | Dict | No Default | A dictionary with "repoId" and "filename" fields locating the *.pt checkpoint. |
| authRequired | Bool | False | Whether authentication is required to access the model. |

If authentication is required, you need to provide the authentication information when you search or add documents to the index.

You can provide other fields such as jit, precision, tokenizer, imagePreprocessor, mean, std, size, and note in modelProperties to configure your model.

Examples

Here are some examples of loading a custom OpenCLIP model in Marqo. Note that if name has the "hf-hub:" prefix, Marqo will try to load the model from Hugging Face and ignore the url and modelLocation fields. Otherwise, if you provide url or modelLocation, Marqo will load the model from the provided location and treat name as the model architecture.

Example 1: Load a custom OpenCLIP model from a public URL without configurations

settings = {
    "treatUrlsAndPointersAsImages": True,
    "model": "my-own-clip-model",
    "modelProperties": {
        "name": "ViT-B-32",
        "dimensions": 512,
        "url": "https://github.com/mlfoundations/open_clip/releases/download/v0.2-weights/vit_b_32-quickgelu-laion400m_e32-46683a32.pt",
        "type": "open_clip",
    },
    "normalizeEmbeddings": True,
}

response = mq.create_index("my-own-clip-model", settings_dict=settings)

The above code loads a custom OpenCLIP model from a public URL. Note that this is the same as loading the model open_clip/ViT-B-32/laion400m_e32 from the registry; we use the public URL of the model checkpoint as an example.

Example 2: Load a custom OpenCLIP model from a public URL with custom configurations

settings = {
    "treatUrlsAndPointersAsImages": True,
    "model": "my-own-clip-model",
    "modelProperties": {
        "name": "ViT-B-16-SigLIP",
        "dimensions": 768,
        "url": "https://huggingface.co/Marqo/marqo-fashionSigLIP/resolve/main/open_clip_pytorch_model.bin",
        "imagePreprocessor": "SigLIP",
        "type": "open_clip",
    },
    "normalizeEmbeddings": True,
}

response = mq.create_index("my-own-clip-model", settings_dict=settings)

The above code loads a custom OpenCLIP model from a public URL with a custom image preprocessor configuration. It is very important to provide the correct imagePreprocessor configuration to match the model architecture: Marqo cannot infer the correct configuration from the model name when you load a checkpoint file, and will fall back to the default configuration ("OpenCLIP"). The imagePreprocessor is set to "SigLIP" in this example to match the model architecture ViT-B-16-SigLIP.

Note this is the same as loading the Marqo FashionSigLIP model from the registry. We use the public URL of the model checkpoint as an example.

Example 3: Load a custom OpenCLIP model from a private S3 bucket with authentication

settings = {
    "treatUrlsAndPointersAsImages": True,
    "model": "my-private-clip-model",
    "modelProperties": {
        "name": "ViT-B-32",
        "dimensions": 512,
        "modelLocation": {
            "s3": {
                "Bucket": "my-private-bucket",
                "Key": "my-private-model-checkpoint.pt",
            },
            "authRequired": True,
        },
        "type": "open_clip",
    },
    "normalizeEmbeddings": True,
}

response = mq.create_index("my-own-clip-model", settings_dict=settings)

model_auth = {
    "s3": {
        "aws_secret_access_key": "my-secret-access-key",
        "aws_access_key_id": "my-access-key-id",
    }
}

mq.index("my-own-clip-model").search("test", model_auth=model_auth)

The above code loads a custom OpenCLIP model from a private S3 bucket with authentication. authRequired is set to True, so you need to provide the authentication information when you search or add documents to the index.
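The examples above cover loading from a URL and from S3. For completeness, here is a hedged sketch of the Hugging Face variant of modelLocation for an OpenCLIP checkpoint; the model name, repoId, and filename are placeholders, not real resources:

```python
# A sketch of index settings pointing modelLocation at a Hugging Face
# repository; repoId and filename below are placeholders.
settings = {
    "treatUrlsAndPointersAsImages": True,
    "model": "my-hf-hosted-clip-model",
    "modelProperties": {
        "name": "ViT-B-32",  # the model architecture
        "dimensions": 512,
        "modelLocation": {
            "hf": {
                "repoId": "<your-hf-repo-name>",
                "filename": "<your-checkpoint>.pt",  # must be a *.pt checkpoint
            },
            "authRequired": True,  # only needed for a private repository
        },
        "type": "open_clip",
    },
    "normalizeEmbeddings": True,
}
```

You would pass this dictionary to mq.create_index as in the examples above; if the repository is private, supply the Hugging Face token through model_auth at search or add-documents time.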

Bring your own Hugging Face Sentence Transformers models

Marqo supports using your own Hugging Face Sentence Transformers models to generate embeddings for your text data. You can use your own fine-tuned models to achieve better search results on domain-specific tasks. Generally, your model should follow the Hugging Face Sentence Transformers model format and be loadable with AutoModel.from_pretrained and AutoTokenizer.from_pretrained in the transformers library.

The full set of fields available in modelProperties is listed below:

| Field Name | Type | Default Value | Description |
|---|---|---|---|
| name | String (Optional) | None | The name of the model. This can be a Hugging Face repository name. |
| dimensions | Integer | No Default | The dimension of the embeddings generated by the model. |
| type | String | No Default | The type of the model. It must be one of "hf" or "hf_stella". |
| url | String (Optional) | None | The URL of the model checkpoint. Cannot be provided together with modelLocation. |
| modelLocation | Dict (Optional) | None | The location of the model in S3 or Hugging Face. Cannot be provided together with url. |
| poolingMethod | String (Optional) | "mean" | The pooling method used to generate the sentence embeddings. It must be one of "mean" or "cls". |
| note | String (Optional) | None | A place to add notes to your model. This does not affect the model loading process. |
| trustRemoteCode* | Bool (Optional) | False | Whether to trust remote code when loading the model. Set this to True if you are loading an hf_stella model. |
| tokens | Integer (Optional) | 128 | The maximum number of tokens used by the tokenizer. |
| text_query_prefix | String (Optional) | None | The default prefix added to text queries. Note this is overridden by the text_query_prefix parameter in search. |
| text_chunk_prefix | String (Optional) | None | The default prefix added to document chunks. Note this is overridden by the text_chunk_prefix parameter in add_documents. |

* Enabling trustRemoteCode allows the model to execute code from remote sources, which can be necessary for custom functionality in hf_stella models. However, it introduces security risks, as unverified code may be executed. It is recommended to enable this flag only for trusted sources and use appropriate access controls and monitoring.

To load your model, at least one of the name, url, or modelLocation fields should be provided.
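For intuition on the poolingMethod field, here is a minimal NumPy sketch (illustrative only, not Marqo's internal implementation) contrasting "mean" and "cls" pooling over a batch of token embeddings:

```python
import numpy as np

# Illustrative shapes: a batch of 2 sequences, 4 tokens each,
# with 8-dimensional token embeddings.
token_embeddings = np.random.rand(2, 4, 8)
attention_mask = np.array([[1, 1, 1, 0],   # last token is padding
                           [1, 1, 1, 1]])

# "mean" pooling: average the token embeddings, ignoring padded positions.
mask = attention_mask[:, :, None]                               # (2, 4, 1)
mean_pooled = (token_embeddings * mask).sum(axis=1) / mask.sum(axis=1)

# "cls" pooling: take the embedding of the first ([CLS]) token.
cls_pooled = token_embeddings[:, 0, :]

print(mean_pooled.shape, cls_pooled.shape)  # → (2, 8) (2, 8)
```

Either way, the pooled output per sequence is a single vector whose length must match the dimensions field.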

Load from a Hugging Face model card

The easiest way to load your fine-tuned model is to use the Hugging Face model card. After the fine-tuning, you can upload your model to the Hugging Face model hub and use the model card to load your model in Marqo. The model can be public or private. If the model is private, you need to provide the authentication information when you search or add documents to the index.

If the model is public, you can load it with the following code:

# Loading the model from a public Hugging Face model card:
settings = {
    "model": "your-own-sentence-transformers-model",
    "modelProperties": {
        "name": "<your-public-huggingface-model-card-name>",
        "dimensions": 384,  # the dimension of the embeddings generated by the model
        "type": "hf",
        "tokens": 128,  # the maximum number of tokens to be used in the tokenizer
    },
    "normalizeEmbeddings": True,
}

response = mq.create_index("test-custom-hf-model", settings_dict=settings)

If the model is private, you can load it with the following code:

settings = {
    "model": "your-own-sentence-transformers-model",
    "modelProperties": {
        "modelLocation": {
            "hf": {
                "repoId": "<your-private-huggingface-model-card-name>",
            },
            "authRequired": True,
        },
        "dimensions": 384,  # the dimension of the embeddings generated by the model
        "type": "hf",
        "tokens": 128,  # the maximum number of tokens to be used in the tokenizer
    },
    "normalizeEmbeddings": True,
}

response = mq.create_index("test-custom-hf-model", settings_dict=settings)

model_auth = {"hf": {"token": "<your-hf-token>"}}

mq.index("test-custom-hf-model").search("test", model_auth=model_auth)

Load from a zip file

You can also provide a zip file containing your fine-tuned model and load it in Marqo. The zip file should contain all the files necessary for the model, including the model checkpoint, tokenizer, and configuration files. Note that these files should be in the root directory of the zip file.
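As an illustration of that layout, here is a minimal sketch using Python's standard zipfile module; the directory and file names are placeholders, with dummy contents standing in for the real model files. The arcname argument keeps each file at the root of the archive:

```python
import os
import tempfile
import zipfile

# Stand-in model directory; a real one would hold the actual
# pytorch_model.bin, config.json, tokenizer files, etc.
model_files = ["pytorch_model.bin", "config.json", "tokenizer.json"]
model_dir = tempfile.mkdtemp()
for fname in model_files:
    with open(os.path.join(model_dir, fname), "w") as f:
        f.write("placeholder")

zip_path = os.path.join(model_dir, "model.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    for fname in model_files:
        # arcname=fname places each file at the archive root,
        # not nested under a directory.
        zf.write(os.path.join(model_dir, fname), arcname=fname)

print(zipfile.ZipFile(zip_path).namelist())
```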

The zip file can be provided in three ways:

1. A public URL
2. A public/private S3 bucket
3. A public/private Hugging Face repository

Here is the code:

# Load from a public url
settings = {
    "model": "your-own-sentence-transformers-model",
    "modelProperties": {
        "url": "https://path/to/your/sbert/model.zip",
        "dimensions": 384,  # the dimension of the embeddings generated by the model
        "type": "hf",
        "tokens": 128,  # the maximum number of tokens to be used in the tokenizer
    },
    "normalizeEmbeddings": True,
}

# Load from a s3 bucket
settings = {
    "model": "your-own-sentence-transformers-model",
    "modelProperties": {
        "modelLocation": {
            "s3": {
                "Bucket": "<your-s3-bucket-name>",
                "Key": "<your-zip-file-key.zip>",  # a zip file
            }
        },
        "dimensions": 384,  # the dimension of the embeddings generated by the model
        "type": "hf",
        "tokens": 128,  # the maximum number of tokens to be used in the tokenizer
    },
    "normalizeEmbeddings": True,
}

# Load from a hf repository
settings = {
    "model": "your-own-sentence-transformers-model",
    "modelProperties": {
        "modelLocation": {
            "hf": {
                "repoId": "<your-hf-repo-name>",
                "filename": "<your-zip-file-key.zip>",  # a zip file
            }
        },
        "dimensions": 384,  # the dimension of the embeddings generated by the model
        "type": "hf",
        "tokens": 128,  # the maximum number of tokens to be used in the tokenizer
    },
    "normalizeEmbeddings": True,
}

response = mq.create_index("test-custom-hf-model", settings_dict=settings)

Stella Models

Marqo also supports loading Stella models. The dunzhang/stella_en_400M_v5 model is included in the model registry, and you can load it by following the guides here.

If you want to load your fine-tuned Stella model, you can use the same loading methods above with type set to "hf_stella" and trustRemoteCode set to True. Stella models are trained with Matryoshka Representation Learning (MRL), so they can produce embeddings at multiple dimensions. However, Marqo requires you to concatenate the linear layers within the model: your model must provide end-to-end embeddings with a single dimension.

Examples

Here are some examples of loading your own Hugging Face Sentence Transformers model in Marqo.

Example 1: Load a model from a public Hugging Face model card

The sentence-transformers/nli-bert-base-cls-pooling model is not included in the model registry. You can still load it from its Hugging Face model card with the following code:

settings = {
    "model": "your-own-sentence-transformers-model",
    "modelProperties": {
        "name": "sentence-transformers/nli-bert-base-cls-pooling",
        "dimensions": 768,
        "type": "hf",
    },
    "normalizeEmbeddings": True,
}

Example 2: Load a private model from a Hugging Face model card

Here, we load a private model from the Hugging Face model card with the following code:

settings = {
    "model": "your-own-sentence-transformers-model",
    "modelProperties": {
        "modelLocation": {
            "hf": {
                # This is a private model that you don't have access to
                "repoId": "Marqo/e5-base-v2-private-test",
            },
            "authRequired": True,
        },
        "dimensions": 768,
        "type": "hf",
    },
    "normalizeEmbeddings": True,
}

response = mq.create_index("test-custom-hf-model", settings_dict=settings)

model_auth = {"hf": {"token": "<your-hf-token>"}}

res = mq.index("test-custom-hf-model").search("test", model_auth=model_auth)

Example 3: Load a model from a public URL

Here, we load a model from a public URL to a zip file with the following code:

settings = {
    "model": "your-own-sentence-transformers-model",
    "modelProperties": {
        "url": "https://marqo-ecs-50-audio-test-dataset.s3.us-east-1.amazonaws.com/test-hf.zip",  # a public URL
        "dimensions": 384,
        "type": "hf",
    },
    "normalizeEmbeddings": True,
}

response = mq.create_index("test-custom-hf-model", settings_dict=settings)

Example 4: Load a Stella model from a public URL

Here, we load a Stella model from a public URL to a zip file with the following code. Note that this feature is not available for cloud users for security reasons.

settings = {
    "model": "your-own-sentence-transformers-model",
    "modelProperties": {
        "url": "https://a-public-url-to-your-stella-model/model.zip",
        "dimensions": 1024,
        "type": "hf_stella",
        "trustRemoteCode": True,
    },
}

response = mq.create_index("test-custom-hf-model", settings_dict=settings)

Preloading your model

There may be cases where you want to preload (or prewarm) your model before using it to index. This can be done by adding your model (with model and modelProperties) to the list of models loaded on startup in your Marqo configuration.

The syntax for this can be found in Configuring preloaded models.
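For the open-source Docker distribution, preloading is typically configured through the MARQO_MODELS_TO_PRELOAD environment variable, which takes a JSON list of registry model names and/or custom model objects. A sketch with placeholder values (verify the exact syntax against the linked guide):

```shell
docker run -p 8882:8882 \
  -e MARQO_MODELS_TO_PRELOAD='[
    "hf/e5-base-v2",
    {
      "model": "my-own-clip-model",
      "modelProperties": {
        "name": "ViT-B-32",
        "dimensions": 512,
        "url": "<url-to-your-checkpoint.pt>",
        "type": "open_clip"
      }
    }
  ]' \
  marqoai/marqo:latest
```

This way the models are downloaded and loaded when the container starts, rather than on the first add_documents or search call.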