Implementing Typeahead Search

Typeahead (also known as search-as-you-type or autocomplete) helps users find what they're looking for by providing intelligent query suggestions as they type. This reduces zero-result searches, speeds up the search process, and improves user experience.

What is Typeahead?

Typeahead provides real-time query suggestions based on partial user input. As users type, the system returns a list of suggested complete queries that match their input, helping them:

Find relevant queries faster
Discover popular search terms
Avoid typos and spelling mistakes
Navigate to high-intent content quickly

Marqo's typeahead uses a two-stage process:

Retrieval: Match indexed queries using prefix matching and fuzzy search
Ranking: Score suggestions using popularity and relevance (BM25) to surface the best matches

Setting Up Typeahead

Prerequisites

Typeahead is supported in Marqo 2.24.0 and later. You'll need an existing Marqo index; typeahead works with any index type and requires no extra setup at index creation.

Indexing Popular Queries

The first step is building your suggestion corpus by indexing popular search queries. These typically come from:

Search logs and analytics
Popular product names
Common search patterns
Trending terms
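
As one way to turn raw search logs into the request body used in the next section, the sketch below counts normalized queries and uses the count as both popularity and metadata.hitCount. The log format (a plain list of query strings), the normalization rule, and the min_hits and top_n cutoffs are assumptions for illustration, not part of Marqo.

```python
from collections import Counter


def build_suggestion_payload(logged_queries, min_hits=2, top_n=1000):
    """Aggregate raw logged queries into a {"queries": [...]} payload."""
    # Normalize case and whitespace so "Running  Shoes" and "running shoes" count together.
    normalized = (" ".join(q.lower().split()) for q in logged_queries)
    counts = Counter(q for q in normalized if q)

    queries = [
        {"query": q, "popularity": float(hits), "metadata": {"hitCount": hits}}
        for q, hits in counts.most_common(top_n)
        if hits >= min_hits  # drop one-off queries (assumed cutoff)
    ]
    return {"queries": queries}


log = [
    "Running Shoes", "running  shoes", "iphone 15 case",
    "running shoes", "iphone 15 case", "rare query",
]
payload = build_suggestion_payload(log)
print(payload)
```

Raising min_hits keeps one-off or misspelled queries out of the suggestion corpus, at the cost of missing new terms until they accumulate hits.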

Indexing Queries from Search Logs

For Marqo Open Source, send the request to your local Marqo instance:

curl -XPOST 'http://localhost:8882/indexes/ecommerce-products/suggestions/queries' \
  -H 'Content-type:application/json' -d '
{
  "queries": [
    {
      "query": "wireless bluetooth headphones",
      "popularity": 245.0,
      "metadata": {
        "hitCount": 245
      }
    },
    {
      "query": "iphone 15 case clear",
      "popularity": 189.0,
      "metadata": {
        "hitCount": 189
      }
    },
    {
      "query": "running shoes nike air",
      "popularity": 156.0,
      "metadata": {
        "hitCount": 156
      }
    },
    {
      "query": "coffee machine espresso",
      "popularity": 134.0,
      "metadata": {
        "hitCount": 134
      }
    }
  ]
}'

For Marqo Cloud, replace your_endpoint with the endpoint of your index and pass your API key in the x-api-key header:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions/queries' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "queries": [
    {
      "query": "wireless bluetooth headphones",
      "popularity": 245.0,
      "metadata": {
        "hitCount": 245
      }
    },
    {
      "query": "iphone 15 case clear",
      "popularity": 189.0,
      "metadata": {
        "hitCount": 189
      }
    },
    {
      "query": "running shoes nike air",
      "popularity": 156.0,
      "metadata": {
        "hitCount": 156
      }
    },
    {
      "query": "coffee machine espresso",
      "popularity": 134.0,
      "metadata": {
        "hitCount": 134
      }
    }
  ]
}'

Batch Processing Tips

When indexing large numbers of queries:

Process in batches of up to 128 queries per request (default limit)
Monitor the response for any errors in the errors array
Consider implementing retry logic for failed requests
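
The tips above could be sketched as follows. The post_batch callable is a hypothetical helper you would supply (for example a thin wrapper around the POST to .../suggestions/queries shown earlier), and the response shape with an "errors" list is assumed from the tip above. The sketch retries a whole batch, which assumes that re-sending an already indexed query is harmless.

```python
import time


def chunked(items, size=128):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def index_in_batches(queries, post_batch, batch_size=128, max_attempts=3, backoff_s=0.5):
    """Send queries batch by batch and return the batches that never succeeded.

    `post_batch` (hypothetical) takes {"queries": [...]} and returns the parsed
    JSON response as a dict.
    """
    failed = []
    for batch in chunked(queries, batch_size):
        for attempt in range(1, max_attempts + 1):
            try:
                result = post_batch({"queries": batch})
                # Assumed response shape: an "errors" list that is empty on success.
                if not result.get("errors"):
                    break
            except Exception:
                pass  # treat transport errors like any other failed attempt
            if attempt < max_attempts:
                # Exponential backoff between attempts: 0.5s, 1s, 2s, ...
                time.sleep(backoff_s * 2 ** (attempt - 1))
        else:
            # The loop ended without a break, so every attempt failed.
            failed.append(batch)
    return failed
```

The returned list can be logged or written to disk so failed batches are re-sent later rather than silently lost.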

Implementing Search-as-you-type

Frontend HTML Page

Create a simple HTML page with typeahead functionality. This creates a search interface where typing displays a dropdown list of suggestions below the input field:

Typeahead Interface

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Typeahead Search</title>
    <style>
        .search-container {
            width: 400px;
            margin: 50px auto;
            position: relative;
        }

        #search-input {
            width: 100%;
            padding: 10px;
            font-size: 16px;
            border: 2px solid #ddd;
            border-radius: 4px;
        }

        #suggestions {
            position: absolute;
            top: 100%;
            left: 0;
            right: 0;
            background: white;
            border: 1px solid #ddd;
            border-top: none;
            max-height: 200px;
            overflow-y: auto;
            z-index: 1000;
        }

        .suggestion-item {
            padding: 10px;
            cursor: pointer;
            border-bottom: 1px solid #eee;
        }

        .suggestion-item:hover {
            background-color: #f0f0f0;
        }
    </style>
</head>
<body>
    <div class="search-container">
        <input type="text" id="search-input" placeholder="Search for products..." />
        <div id="suggestions"></div>
    </div>

    <script>
        let debounceTimeout;

        document.getElementById('search-input').addEventListener('input', function(e) {
            const query = e.target.value.trim();
            const suggestionsDiv = document.getElementById('suggestions');

            clearTimeout(debounceTimeout);

            debounceTimeout = setTimeout(async () => {
                if (query.length >= 2) {
                    try {
                        const response = await fetch('/api/suggestions', {
                            method: 'POST',
                            headers: {
                                'Content-Type': 'application/json'
                            },
                            body: JSON.stringify({
                                q: query,
                                limit: 8,
                                fuzzyEditDistance: 1,
                                minFuzzyMatchLength: 10,
                                popularityWeight: 0.4,
                                bm25Weight: 0.6
                            })
                        });

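                        if (!response.ok) {
                            throw new Error(`Suggestions request failed: ${response.status}`);
                        }

                        const data = await response.json();
                        // Ignore stale responses if the input changed while the request was in flight
                        if (e.target.value.trim() !== query) {
                            return;
                        }
                        displaySuggestions(data.suggestions || [], suggestionsDiv);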
                    } catch (error) {
                        console.error('Error fetching suggestions:', error);
                        suggestionsDiv.innerHTML = '';
                    }
                } else {
                    suggestionsDiv.innerHTML = '';
                }
            }, 200);
        });

        function displaySuggestions(suggestions, container) {
            container.innerHTML = '';

            suggestions.forEach(suggestion => {
                const div = document.createElement('div');
                div.className = 'suggestion-item';
                div.textContent = suggestion.suggestion;

                div.addEventListener('click', () => {
                    document.getElementById('search-input').value = suggestion.suggestion;
                    container.innerHTML = '';
                    // Redirect to search results
                    window.location.href = `/search?q=${encodeURIComponent(suggestion.suggestion)}`;
                });

                container.appendChild(div);
            });
        }
    </script>
</body>
</html>

Python Proxy Server

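Create a Python Flask server that proxies requests to Marqo Cloud, so the API key never reaches the browser. The server expects the HTML page above to be saved as index.html in its working directory: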

import os

import requests
from flask import Flask, jsonify, request
from markupsafe import escape

app = Flask(__name__)

# Configuration: read from the environment so the API key stays out of source control
MARQO_ENDPOINT = os.environ.get("MARQO_ENDPOINT", "your_endpoint")  # Your Marqo Cloud endpoint
MARQO_API_KEY = os.environ.get("MARQO_API_KEY", "your_api_key")  # Your API key
INDEX_NAME = "ecommerce-products"


@app.route("/")
def index():
    # Serve the HTML page (you can also serve it as a static file)
    with open("index.html", "r", encoding="utf-8") as f:
        return f.read()


@app.route("/api/suggestions", methods=["POST"])
def get_suggestions():
    # Tolerate a missing or malformed JSON body instead of raising
    data = request.get_json(silent=True)
    if not isinstance(data, dict):
        data = {}
    query = str(data.get("q", "")).strip()

    if len(query) < 2:
        return jsonify({"suggestions": []})

    # Prepare request to Marqo Cloud
    marqo_url = f"{MARQO_ENDPOINT}/indexes/{INDEX_NAME}/suggestions"
    headers = {"x-api-key": MARQO_API_KEY, "Content-Type": "application/json"}

    payload = {
        "q": query,
        "limit": data.get("limit", 8),
        "fuzzyEditDistance": data.get("fuzzyEditDistance", 2),
        "popularityWeight": data.get("popularityWeight", 0.4),
        "bm25Weight": data.get("bm25Weight", 0.6),
    }
    # Forward minFuzzyMatchLength only when the client sends it
    if "minFuzzyMatchLength" in data:
        payload["minFuzzyMatchLength"] = data["minFuzzyMatchLength"]

    try:
        # Make request to Marqo Cloud; the timeout keeps a slow upstream from hanging the UI
        response = requests.post(marqo_url, json=payload, headers=headers, timeout=5)
        response.raise_for_status()

        return jsonify(response.json())

    except requests.RequestException as e:
        app.logger.error(f"Error calling Marqo API: {e}")
        return jsonify({"suggestions": []}), 500


@app.route("/search")
def search():
    query = request.args.get("q", "")
    # Implement your search results page here.
    # Escape the query before echoing it into HTML to avoid reflected XSS.
    return f"<h1>Search Results for: {escape(query)}</h1><p>Implement your search results here.</p>"


if __name__ == "__main__":
    # debug=True is for local development only
    app.run(debug=True)

Tuning Suggestions

Adjusting Fuzzy Matching

Fuzzy matching helps handle typos and variations. Tune these parameters based on your use case:

Strict matching, good for technical terms and model numbers:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "q": "iphn",
  "limit": 8,
  "fuzzyEditDistance": 1,
  "minFuzzyMatchLength": 4,
  "popularityWeight": 0.4,
  "bm25Weight": 0.6
}'

Lenient matching, good for general search and brand names:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "q": "bluetoth",
  "limit": 8,
  "fuzzyEditDistance": 3,
  "minFuzzyMatchLength": 2,
  "popularityWeight": 0.4,
  "bm25Weight": 0.6
}'
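
To build intuition for what a given fuzzyEditDistance allows, the sketch below computes the classic Levenshtein distance between two strings. This section does not state which distance variant Marqo uses or how it applies it to partial input, so treat this as an illustration of the general idea rather than a description of Marqo's matcher.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free when characters match
            ))
        prev = curr
    return prev[-1]


print(edit_distance("bluetoth", "bluetooth"))  # 1: one missing "o"
print(edit_distance("iphn", "iphon"))          # 1: one missing "o"
print(edit_distance("iphn", "iphone"))         # 2: missing "o" and "e"
```

Under this measure, a distance of 1 tolerates a single slip per term, while a distance of 3 on short inputs can match many unrelated terms, which is why lenient settings trade precision for recall.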

Balancing Popularity and Relevance

The scoring system combines popularity (how often queries are searched) with BM25 relevance (how well the input matches the query text):

Popularity-focused, great for trending topics and seasonal items:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "q": "headphones",
  "limit": 8,
  "fuzzyEditDistance": 2,
  "popularityWeight": 0.8,
  "bm25Weight": 0.2
}'

Relevance-focused, better for precise matching and technical searches:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "q": "headphones",
  "limit": 8,
  "fuzzyEditDistance": 2,
  "popularityWeight": 0.2,
  "bm25Weight": 0.8
}'

Balanced approach, a good general-purpose setting:

curl -XPOST 'your_endpoint/indexes/ecommerce-products/suggestions' \
-H 'x-api-key: XXXXXXXXXXXXX' \
-H 'Content-type:application/json' -d '
{
  "q": "headphones",
  "limit": 8,
  "fuzzyEditDistance": 2,
  "popularityWeight": 0.4,
  "bm25Weight": 0.6
}'
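
To see how the two weights can reorder results, the toy sketch below ranks candidates by a weighted sum of max-normalized popularity and relevance. The exact formula Marqo uses is not given here, so the normalization, the blend function, and the relevance numbers are all assumptions chosen for illustration.

```python
def blend(candidates, popularity_weight, bm25_weight):
    """Rank (query, popularity, relevance) tuples by a weighted sum of normalized signals."""
    # Scale each signal to [0, 1] by its maximum so the weights are comparable.
    max_pop = max(c[1] for c in candidates) or 1.0
    max_rel = max(c[2] for c in candidates) or 1.0
    scored = [
        (popularity_weight * pop / max_pop + bm25_weight * rel / max_rel, query)
        for query, pop, rel in candidates
    ]
    return [query for _, query in sorted(scored, reverse=True)]


# Made-up candidates for the input "headphones": (query, popularity, relevance)
candidates = [
    ("wireless bluetooth headphones", 245.0, 1.2),
    ("headphones", 40.0, 3.0),
    ("headphones stand", 90.0, 2.1),
]

print(blend(candidates, popularity_weight=0.8, bm25_weight=0.2))
print(blend(candidates, popularity_weight=0.2, bm25_weight=0.8))
```

With these numbers, the popularity-focused weights put "wireless bluetooth headphones" first, while the relevance-focused weights put the exact match "headphones" first.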

Best Practices

Query Management

Index popular queries from search logs regularly
Update popularity scores based on actual usage patterns
Remove outdated or irrelevant queries periodically
Add seasonal/trending queries proactively
Process query updates in batches of up to 128 queries
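
One possible way to keep popularity scores in line with recent usage is to weight older hits less. The sketch below applies an exponential half-life decay and leaves out queries whose decayed score falls below a threshold. The input shape (hits keyed by age in days), the 30-day half-life, and the threshold are assumptions; how stale entries are removed from the index itself is not covered here.

```python
def decayed_popularity(hits_by_age_days, half_life_days=30.0):
    """Sum hits, halving the weight of a hit for every half-life of age."""
    return sum(
        hits * 0.5 ** (age_days / half_life_days)
        for age_days, hits in hits_by_age_days.items()
    )


def refresh_queries(history, half_life_days=30.0, min_popularity=1.0):
    """Return query entries with decayed popularity, most popular first."""
    refreshed = []
    for query, hits_by_age_days in history.items():
        popularity = round(decayed_popularity(hits_by_age_days, half_life_days), 2)
        if popularity >= min_popularity:  # leave out queries that have gone stale
            refreshed.append({"query": query, "popularity": popularity})
    return sorted(refreshed, key=lambda item: item["popularity"], reverse=True)


history = {
    "running shoes nike air": {0: 10, 30: 20},  # 10 hits today, 20 hits a month ago
    "christmas lights": {300: 500},             # busy ten months ago, silent since
}
print(refresh_queries(history))
```

In this example the seasonal query decays to under one effective hit and is left out, while the steadily searched query keeps a score of 20.0.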

Configuration

Start with balanced weights (popularityWeight 0.4, bm25Weight 0.6)
Use a fuzzyEditDistance of 1-2 for most use cases
Set minFuzzyMatchLength to 3 or more for better precision
Limit suggestions to 5-10 per request for optimal UX

Performance Optimization

Implement caching strategies for frequently requested queries
Use debouncing on frontend (200-300ms)
Make suggestion requests asynchronously so typing is never blocked by a pending call
Monitor response times and adjust accordingly
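
As a sketch of the caching idea, the small in-memory cache below stores suggestion lists for a fixed time so repeated prefixes skip the upstream call. The TTLCache class, its default values, and the injectable clock are illustrative choices, not part of Marqo or Flask; a shared store would be needed if the proxy runs as several processes.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_s seconds."""

    def __init__(self, ttl_s=60.0, max_entries=1000, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._max_entries = max_entries
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def put(self, key, value):
        if key not in self._store and len(self._store) >= self._max_entries:
            # Evict the oldest inserted entry (dicts preserve insertion order).
            self._store.pop(next(iter(self._store)))
        self._store[key] = (self._clock() + self._ttl_s, value)


cache = TTLCache(ttl_s=30.0)
cache.put(("head", 8), ["headphones", "headphones stand"])
print(cache.get(("head", 8)))
```

The cache key should include everything that changes the result, such as the normalized input, the limit, and the weights; a short TTL keeps newly indexed queries from being hidden for long.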

User Experience

Show suggestions after 2+ characters
Highlight matching text in suggestions
Provide keyboard navigation (up/down arrows)
Handle empty states gracefully

Monitoring and Analytics

Track which suggestions users actually select
Monitor suggestion quality metrics, such as how often a suggestion is selected and how high the selected suggestion ranked
Analyze user behavior patterns to improve suggestion relevance
Set up alerts for performance degradation
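
Two simple quality metrics could be computed from logged typeahead events as sketched below: the share of sessions where a suggestion was selected, and the mean reciprocal rank of the selected suggestion. The event shape (a "shown" list and an optional "selected" value) is an assumed logging format, not something Marqo produces.

```python
def suggestion_metrics(events):
    """Compute selection rate and mean reciprocal rank over typeahead events."""
    total = len(events)
    if total == 0:
        return {"selection_rate": 0.0, "mean_reciprocal_rank": 0.0}

    selected_ranks = []
    for event in events:
        choice = event.get("selected")
        if choice in event["shown"]:
            # Rank is 1-based: the top suggestion has rank 1.
            selected_ranks.append(event["shown"].index(choice) + 1)

    return {
        "selection_rate": len(selected_ranks) / total,
        # Events with no selection contribute 0 to the reciprocal rank sum.
        "mean_reciprocal_rank": sum(1 / rank for rank in selected_ranks) / total,
    }


events = [
    {"shown": ["a", "b", "c"], "selected": "a"},
    {"shown": ["a", "b"], "selected": "b"},
    {"shown": ["x"], "selected": None},
    {"shown": ["a", "b", "c", "d"], "selected": "d"},
]
print(suggestion_metrics(events))
```

A falling selection rate suggests the corpus has drifted from what users search for; a low mean reciprocal rank with a healthy selection rate suggests the right suggestions are present but ranked too low, which points at the weights.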

Typeahead improves the search experience most when the suggestion corpus is kept fresh, the weights are tuned to your catalog, and suggestion selections are monitored over time. Following these practices helps users find what they're looking for faster.