Introducing Marqtune

We are delighted to introduce Marqtune, the embedding model training platform that allows you to train highly specialised, billion-parameter embedding models that improve search, recommendations, and RAG applications.

Why Did We Create Marqtune?

We’ve built Marqtune on the foundation of our new training framework, Generalized Contrastive Learning (GCL). With GCL, you can fine-tune embedding models to rank search results not only by semantic relevance but also by a ranking system defined by your search team. This means better, more relevant search results that cater to your business.

Marqtune has been developed in response to feedback from our customers. Every vector search system in production needs to have its models continuously retrained and updated. Doing this manually is simply not feasible. Marqtune introduces a user-friendly process for fine-tuning embedding models and allows users to achieve significant improvements in search relevance with minimal engineering effort.

What Existing Problems Does Marqtune Solve?

If you've ever dealt with search systems, you know how frustrating it can be when the results just don’t quite hit the mark. They might technically be correct, but they often miss the true intent behind what you're searching for. That's where Marqtune comes in! By using advanced embedding models fine-tuned with Generalized Contrastive Learning (GCL), you ensure that search results are not only spot-on in terms of relevance but also perfectly aligned with your business needs.

Another huge pain point for many businesses is the tedious task of maintaining and updating vector search models. Manually retraining these models can be a real headache: it's time-consuming and requires a lot of technical know-how. Marqtune takes the hassle out of this process. With our platform, you can fine-tune embedding models with just a few lines of code, making it super easy to keep your search systems up-to-date and performing at their best.

In short, Marqtune solves these common problems by delivering superior search experiences tailored to your unique needs, without requiring extensive engineering resources. It's all about making life easier and more efficient for you and your business.

What is Fine-Tuning?

Fine-tuning is the process of training an existing machine learning model (the "base-model") on new data. The base-model is typically pre-trained on large, general corpora, giving it a good all-round understanding of that domain. A fine-tuned model benefits from this general understanding because fine-tuning reuses the base-model's parameters as the starting point for training on domain-specific data.
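To make the idea concrete, here is a minimal sketch (not Marqtune code, and deliberately a toy one-parameter model rather than an embedding model): "pre-training" fits a model on broad data, and "fine-tuning" continues gradient descent from those learned parameters on a smaller, domain-specific dataset.

```python
# Toy illustration of fine-tuning: continue gradient descent from a
# pretrained parameter instead of starting from scratch.

def train(w, data, lr=0.1, steps=100):
    """One-parameter model y = w * x, trained with squared-error gradient descent."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

# "Pre-training": learn a general trend (roughly y = 2x) from broad data.
general_data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]
base_w = train(0.0, general_data)

# "Fine-tuning": start from base_w and adapt to domain data (roughly y = 2.5x),
# using a smaller learning rate and fewer steps than pre-training.
domain_data = [(1.0, 2.5), (2.0, 5.0)]
tuned_w = train(base_w, domain_data, lr=0.05, steps=50)

print(round(base_w, 2), round(tuned_w, 2))  # → 1.99 2.5
```

The same principle applies to embedding models: the pretrained weights carry the general understanding, and fine-tuning nudges them toward the target domain.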

What are the Benefits to Fine-Tuning?

Fine-tuning is used to improve the performance of a machine learning model on a specific domain or task. This may be necessary because the concepts to be learned are very specific to a domain, application, or business. They could also be new concepts, or existing concepts that need to be updated. For example, fine-tuning a CLIP model on historic query logs and product images will make it better at matching queries to product images.

How Does Fine-Tuning Work?

Fine-tuning works by taking a base-model and further training it on domain-specific data. The dataset size and training parameters are typically different from those used to train the base-model. This is because the training is meant to preserve some of the knowledge from the base-model, and selecting incorrect parameters can degrade performance. For more details about the Marqtune training method, see the Generalized Contrastive Learning for Multi-Modal Retrieval and Ranking blog post and research paper.
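The core idea behind contrastive training (the family of methods GCL generalises) can be sketched in a few lines. This is an illustrative InfoNCE-style loss in plain Python, not Marqtune's actual GCL implementation: each query should score higher against its matching document than against the other documents in the batch.

```python
# Illustrative contrastive (InfoNCE-style) loss, not Marqtune's GCL code.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def contrastive_loss(query_vecs, doc_vecs, temperature=0.1):
    """Query i should match doc i; docs j != i act as in-batch negatives."""
    loss = 0.0
    for i, q in enumerate(query_vecs):
        logits = [cosine(q, d) / temperature for d in doc_vecs]
        log_sum = math.log(sum(math.exp(l) for l in logits))
        loss += log_sum - logits[i]  # negative log-softmax of the true pair
    return loss / len(query_vecs)

# Well-aligned embeddings (each query close to its own doc) give a low loss;
# swapping the documents gives a much higher one.
queries = [[1.0, 0.0], [0.0, 1.0]]
docs_good = [[0.9, 0.1], [0.1, 0.9]]
docs_bad = [[0.1, 0.9], [0.9, 0.1]]
print(contrastive_loss(queries, docs_good) < contrastive_loss(queries, docs_bad))  # → True
```

Training repeatedly minimises a loss of this shape, pulling matching pairs together in the embedding space and pushing mismatched pairs apart.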

Can I Optimize for a Particular Outcome or Business Metric?

Yes. Marqtune can optionally use data that relates to particular outcomes. For example, this might be downloads, add-to-carts, or clicks on products that came from users searching for particular search terms. This "outcome" data provides additional context during fine-tuning about which interactions were more important (think of it like a rating for a product given a particular query). This means the model will learn which products are best to show in order to optimize for a particular metric. Multiple outcome metrics can be used during fine-tuning.
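One simple way to think about this: outcome signals can be combined into per-pair weights, so query–product pairs with stronger outcomes contribute more during training. The sketch below illustrates that idea only; the field names and weighting scheme are made up and are not Marqtune's actual data schema.

```python
# Hypothetical illustration: fold multiple outcome metrics into a single
# training weight per (query, product) interaction.

interactions = [
    {"query": "running shoes", "product": "trail runner x", "clicks": 40, "add_to_cart": 5},
    {"query": "running shoes", "product": "dress shoe a",   "clicks": 2,  "add_to_cart": 0},
]

def outcome_weight(row, click_w=1.0, cart_w=5.0):
    """Combine outcome metrics; add-to-carts count more than clicks here."""
    return click_w * row["clicks"] + cart_w * row["add_to_cart"]

weights = [outcome_weight(r) for r in interactions]
total = sum(weights)
normalised = [w / total for w in weights]
print([round(w, 2) for w in normalised])  # → [0.97, 0.03]
```

A model trained with weights like these learns to rank the trail runner far above the dress shoe for the query "running shoes", which is exactly the behaviour the outcome data rewards.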

What do I Need for Fine-Tuning?

For fine-tuning, the biggest input is a dataset. Base-model selection and training parameters are handled using sensible defaults. The dataset should consist of text and image pairs relevant to the task. The specifics are described in the Data for Fine-tuning guide.
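As a rough illustration of what such a dataset might look like, here is a sketch of text–image pairs with relevance scores serialised as CSV. The column names and format here are purely illustrative; the exact fields Marqtune expects are defined in the Data for Fine-tuning guide.

```python
# Hypothetical dataset sketch: text-image pairs with a relevance score.
import csv, io

rows = [
    {"query": "red floral summer dress", "image": "https://example.com/img/123.jpg", "score": 1.0},
    {"query": "red floral summer dress", "image": "https://example.com/img/456.jpg", "score": 0.3},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["query", "image", "score"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue().strip())
```

Each row pairs a query with an image and a score indicating how relevant the pair is; higher-scoring pairs are the ones the fine-tuned model should rank first.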

Glossary

Fine-tuning: the process of training an existing machine learning model (the "base-model") on new data.

Base-model: a model that has already been trained and is used as the starting point for fine-tuning.

CLIP model: a model trained on image and text pairs so that both images and text can be embedded into a shared vector space.

Model parameters: the arrays of floating-point numbers that make up a model.

Model training: the process of modifying ("updating") model parameters using new data.

Ready to Get Started with Marqtune?

Get started with Marqtune today by visiting our Quick Start Guide.