
Fine-tuning with Marqtune


Fine-tuning: the process of training an existing machine learning model (the “base-model”) on new data.

Base-model: a model that has already been trained on data and is used as a starting point for fine-tuning.

CLIP model: a model trained on pairs of images and text.

Model parameters: A “model” is made up of parameters. These parameters are arrays of floating point numbers.

Model training: the process of modifying (“updating”) model parameters by using new data.

What is fine-tuning?

Fine-tuning is the process of training an existing machine learning model (a “base-model”) on new data. The base-model is typically pre-trained on a large, general corpus, giving it a good all-round understanding of that domain. A fine-tuned model benefits from this general understanding because it reuses the base-model's parameters as the starting point for “fine-tuning” on domain-specific data.
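The idea of reusing parameters as a starting point can be sketched in a few lines. This is an illustrative toy, not the Marqtune API: the parameter values, gradients, and learning rate are all made up.

```python
# Illustrative sketch only (not the Marqtune API): fine-tuning reuses a
# base-model's parameters as the starting point, then keeps updating
# them on domain-specific data. All values here are made up.
base_params = [0.8, -0.3, 1.1]   # hypothetical pre-trained parameters

# Start fine-tuning from the base-model's parameters, not random values.
params = list(base_params)

# One gradient-descent update using gradients from domain-specific data.
domain_grads = [0.5, -0.2, 0.1]
lr = 0.01
params = [p - lr * g for p, g in zip(params, domain_grads)]
# With a small learning rate, params stay close to base_params,
# preserving the base-model's general knowledge.
```

Because updates start from trained values rather than a random initialization, far less data is needed than for training from scratch.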

What are the benefits to fine-tuning?

Fine-tuning is used to improve the performance of a machine learning model on a specific domain or task. This may be necessary because the concepts to be learned are specific to a domain, application, or business, or because they are new concepts or existing concepts that need to be updated. For example, fine-tuning a CLIP model on historic query logs and product images will make it better at matching queries to product images.

How does fine-tuning work?

Fine-tuning works by taking a base-model and further training it on domain-specific data. The dataset size and training parameters typically differ from those used to train the base-model, because fine-tuning is meant to preserve some of the base-model's knowledge and poorly chosen parameters can degrade performance. For more details about the Marqtune training method, see the Generalized Contrastive Learning for Multi-Modal Retrieval and Ranking blog post and research paper.
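The training objective belongs to the contrastive family referenced above. As a hedged sketch of the general idea (not Marqtune's exact formulation, which is described in the linked paper), an InfoNCE-style loss rewards each text embedding for scoring highest against its own paired image embedding:

```python
import math

def dot(a, b):
    """Dot product of two equal-length embedding vectors."""
    return sum(x * y for x, y in zip(a, b))

def contrastive_loss(text_embs, image_embs):
    """InfoNCE-style loss: for each text, its own image should get the
    highest similarity score among all images in the batch."""
    loss = 0.0
    for i, t in enumerate(text_embs):
        scores = [math.exp(dot(t, img)) for img in image_embs]
        loss += -math.log(scores[i] / sum(scores))
    return loss / len(text_embs)

# Two toy text/image pairs; matching pairs are already roughly aligned.
texts = [[1.0, 0.0], [0.0, 1.0]]
images = [[0.9, 0.1], [0.1, 0.9]]
loss = contrastive_loss(texts, images)  # ≈ 0.3711
```

Minimizing this loss pulls matching text/image embeddings together and pushes non-matching ones apart, which is what tunes the model toward the new domain.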

Can I optimize for a particular outcome or business metric?

Yes. Marqtune can optionally use data that relates to particular outcomes, such as downloads, add-to-cart events, or clicks on products returned for particular search terms. This “outcome” data provides additional context during fine-tuning about which interactions mattered more (think of it as a rating for a product given a particular query), so the model learns which products to surface to optimize for a given metric. Multiple outcome metrics can be used during fine-tuning.
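One simple way such outcome data can influence training is as per-pair weights on the loss. The weighting scheme below is purely illustrative (Marqtune's actual formulation is in the Generalized Contrastive Learning paper); the click counts and loss values are made up.

```python
def weighted_loss(pair_losses, outcomes):
    """Combine per-(query, product) losses, weighting each pair by its
    outcome signal so stronger interactions contribute more."""
    total = sum(outcomes)
    weights = [o / total for o in outcomes]  # normalize to sum to 1
    return sum(w * l for w, l in zip(weights, pair_losses))

# Hypothetical per-pair losses and an outcome metric (e.g. clicks).
losses = [0.9, 0.2, 0.5]
clicks = [1, 10, 4]
combined = weighted_loss(losses, clicks)  # ≈ 0.3267
```

Pairs with more clicks dominate the combined loss, so the model is pushed hardest to rank those products well for their queries.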

What do I need for fine-tuning?

For fine-tuning, the most important input is a dataset; base-model selection and training parameters are handled with sensible defaults. The dataset should consist of text and image pairs relevant to the task. The specifics are described in the Data for Fine-tuning guide.
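For a rough picture of what text/image pair data looks like, here is a hypothetical CSV being assembled in Python. The column names and URLs are illustrative only; consult the Data for Fine-tuning guide for the actual schema Marqtune expects.

```python
import csv
import io

# Hypothetical text/image pairs (column names and URLs are made up).
rows = [
    {"text": "red running shoes", "image": "https://example.com/img/shoe1.jpg"},
    {"text": "blue denim jacket", "image": "https://example.com/img/jacket3.jpg"},
]

# Write the pairs out as CSV with a header row.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["text", "image"])
writer.writeheader()
writer.writerows(rows)
dataset_csv = buf.getvalue()
```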