Image Search Demo with Marqo
This guide will walk you through setting up your environment and using Marqo to perform image searches. Follow along for a step-by-step tutorial.
Getting Started
Before diving into the code, let's set up your environment to run Marqo.
Step 1: Clone the Repository
Start by cloning the examples repository to get the necessary data and scripts.
git clone --branch 2.0.0 https://github.com/marqo-ai/marqo.git
cd marqo/examples/ImageSearchGuide
Step 2: Run Marqo
Next, let's get Marqo up and running using Docker:
docker rm -f marqo
docker pull marqoai/marqo:2.0.0
docker run --name marqo -it -p 8882:8882 --add-host host.docker.internal:host-gateway marqoai/marqo:2.0.0
For more detailed instructions, refer to the getting started guide.
Step 3: Download Data
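If you cloned the repository in Step 1, the demo images are already in the data directory under marqo/examples/ImageSearchGuide. Otherwise, download them into a local directory named data: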
# Download the data to a local directory named 'data'
# Data available at: https://github.com/marqo-ai/marqo/tree/2.0.0/examples/ImageSearchGuide/data
With the setup out of the way, let's move on to the code.
Code Walkthrough
This section breaks down the code into manageable steps, guiding you through the image search process using Marqo.
Initialize Marqo Client
First, import Marqo and initialize the client to interact with your local Marqo instance.
import marqo
mq = marqo.Client("http://localhost:8882")
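Before creating an index, it can help to confirm that the Marqo container is actually accepting connections. The following is a minimal sketch using only the standard library; the helper name `is_reachable` and the two-second timeout are illustrative assumptions and are not part of the Marqo client.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url` (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status: still reachable.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False
```

Calling `is_reachable("http://localhost:8882")` right after starting the container should return `False` until Marqo has finished loading and `True` afterwards.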
Step 1: Create an Index
Define the settings for your image index and create it using Marqo.
index_name = "image-search-guide"
settings = {
    "model": "open_clip/ViT-B-32/laion2b_s34b_b79k",
    "treatUrlsAndPointersAsImages": True,
}
mq.create_index(index_name, settings_dict=settings)
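If you rerun the walkthrough, creating an index that already exists will typically fail. A minimal sketch of a delete-then-create helper follows, mirroring what the full script at the end of this guide does. The name `recreate_index` is hypothetical, and the broad `except Exception` is a simplification for this sketch; a narrower exception type from the Marqo client may be more appropriate in real code.

```python
def recreate_index(client, index_name: str, settings: dict) -> None:
    """Delete `index_name` if present, then create it fresh (hypothetical helper)."""
    try:
        client.index(index_name).delete()
    except Exception:
        # Broad on purpose for this sketch: most likely the index did not exist yet.
        pass
    client.create_index(index_name, settings_dict=settings)
```

With the client from above, this would be called as `recreate_index(mq, index_name, settings)`. Note that it discards any documents already stored in the index.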
Step 2: Access Local Images
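Marqo runs inside a Docker container, so it cannot read files from your local disk directly. Start a simple HTTP server that serves your local image directory; the container can then fetch the images through host.docker.internal.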
import subprocess
local_dir = "./data/"
subprocess.Popen(
    ["python3", "-m", "http.server", "8222", "--directory", local_dir],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.STDOUT,
)
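`subprocess.Popen` returns immediately, so the server may not be listening yet when the next step runs. One way to guard against that is to poll the port until it accepts a connection. This is a sketch, not part of the original example; `wait_for_port` is a hypothetical helper and the timeouts are arbitrary.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 5.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=0.5):
                return True
        except OSError:
            # Not listening yet: wait briefly and retry.
            time.sleep(0.1)
    return False
```

For example, keep the object returned by `subprocess.Popen(...)` in a variable such as `server`, call `wait_for_port("localhost", 8222)` before indexing, and call `server.terminate()` once you are done so the process does not linger.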
Step 3: Prepare Images for Indexing
Locate your images, generate their URLs, and prepare the document format for indexing.
import glob
import os
# Find all the local images
locators = glob.glob(local_dir + "*.jpg")
# Generate URLs for local images
docker_path = "http://host.docker.internal:8222/"
image_urls = [docker_path + os.path.basename(f) for f in locators]
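`glob.glob` returns files in arbitrary order, and the document IDs in the next step are derived from list positions, so IDs can differ between runs. The sketch below sorts the filenames and also URL-escapes them, which matters if a filename contains spaces. The function name `build_image_urls` and the `extensions` parameter are assumptions made for illustration.

```python
import glob
import os
from urllib.parse import quote


def build_image_urls(local_dir: str, base_url: str, extensions=("jpg",)) -> list:
    """Return one URL per matching file in `local_dir`, in sorted filename order."""
    names = []
    for ext in extensions:
        pattern = os.path.join(local_dir, "*." + ext)
        names.extend(os.path.basename(p) for p in glob.glob(pattern))
    # Sort so IDs derived from list positions are stable across runs;
    # quote() escapes spaces and other characters that are unsafe in a URL path.
    return [base_url + quote(name) for name in sorted(set(names))]
```

Usage would look like `image_urls = build_image_urls("./data/", "http://host.docker.internal:8222/")`. If filenames are escaped this way, remember to unescape them when mapping a URL back to a local path.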
Step 4: Add Images to the Index
Add your images to the Marqo index with the generated URLs.
documents = [
    {"image_url": image, "_id": str(idx)} for idx, image in enumerate(image_urls)
]
mq.index(index_name).add_documents(documents, tensor_fields=["image_url"])
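Indexing can partially fail, for example when an image URL cannot be fetched from inside the container, so it is worth inspecting what `add_documents` returns. The sketch below assumes the response is a dict with an `items` list whose entries carry an HTTP-style `status` code, or a list of such dicts when the client batches requests. That shape is an assumption; print the response from your own Marqo version to confirm it. `failed_items` is a hypothetical helper.

```python
def failed_items(response) -> list:
    """Collect items with status >= 400 from an add_documents response.

    Assumed shape: {"items": [{"_id": ..., "status": ...}, ...]},
    or a list of such dicts (one per client-side batch).
    """
    batches = response if isinstance(response, list) else [response]
    failures = []
    for batch in batches:
        for item in batch.get("items", []):
            if item.get("status", 200) >= 400:
                failures.append(item)
    return failures
```

Assign the result of `add_documents` to a variable such as `res`, then check `failed_items(res)`; an empty list suggests every document was accepted.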
Step 5: Perform a Search
Use Marqo to search for an image by describing what you're looking for in natural language.
search_query = "A rider on a horse jumping over the barrier"
search_results = mq.index(index_name).search(search_query, limit=1)
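The search call returns a dict whose `hits` list holds the matching documents. To compare several candidates rather than only the top one, you can raise `limit` and summarize each hit. The sketch below assumes each hit includes `_id` and `_score` alongside the indexed field; `summarize_hits` is a hypothetical helper.

```python
def summarize_hits(results: dict, field: str = "image_url") -> list:
    """Return (document id, score, field value) per hit, in the order returned."""
    return [
        (hit.get("_id"), hit.get("_score"), hit.get(field))
        for hit in results.get("hits", [])
    ]
```

For instance, `for doc_id, score, url in summarize_hits(search_results): print(doc_id, score, url)` prints one line per hit.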
Step 6: Visualize the Results
Retrieve the most relevant image from the search results and display it.
from PIL import Image
from IPython.display import display
# Get the path to the image file
fig_path = search_results["hits"][0]["image_url"].replace(docker_path, local_dir)
# Display the image
display(Image.open(fig_path))
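The `str.replace` call above works because every indexed URL starts with `docker_path`. A slightly more defensive version, sketched below, checks that prefix explicitly and unescapes the filename. `url_to_local_path` is a hypothetical helper, not something Marqo provides.

```python
import os
from urllib.parse import unquote


def url_to_local_path(url: str, base_url: str, local_dir: str) -> str:
    """Map a served image URL back to the file it came from."""
    if not url.startswith(base_url):
        raise ValueError(f"{url!r} does not start with {base_url!r}")
    # Strip the server prefix and undo any URL escaping in the filename.
    filename = unquote(url[len(base_url):])
    return os.path.join(local_dir, filename)
```

Outside a notebook, where `IPython.display` is not useful, `Image.open(path).show()` opens the image in the system's default viewer instead.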
Full Code
import marqo
from pprint import pprint
mq = marqo.Client("http://localhost:8882")
####################################################
### STEP 1: Download Data
####################################################
# Download the data from
# https://github.com/marqo-ai/marqo/tree/2.0.0/examples/ImageSearchGuide/data
# and store it in a data/ directory
####################################################
### STEP 2: Start Marqo
####################################################
# Follow the instructions here: https://github.com/marqo-ai/marqo/tree/2.0.0
"""
docker rm -f marqo
docker pull marqoai/marqo:2.0.0
docker run --name marqo -it -p 8882:8882 --add-host host.docker.internal:host-gateway marqoai/marqo:2.0.0
"""
####################################################
### STEP 3: Index Data
####################################################
index_name = 'image-search-guide'
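# Delete any index left over from a previous run so the script can be rerun
try:
    mq.index(index_name).delete()
except Exception:
    # Most likely the index did not exist yet
    pass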
settings = {
    "model": "open_clip/ViT-B-32/laion2b_s34b_b79k",
    "treatUrlsAndPointersAsImages": True,
}
mq.create_index(index_name, settings_dict=settings)
####################################################
### STEP 4: Access Local Images
####################################################
import subprocess
local_dir = "./data/"
# Popen returns a process handle (not a PID); keep it so the server can be stopped later
server = subprocess.Popen(
    ['python3', '-m', 'http.server', '8222', '--directory', local_dir],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.STDOUT,
)
import glob
import os
# Find all the local images
locators = glob.glob(local_dir + '*.jpg')
# Generate docker path for local images
docker_path = "http://host.docker.internal:8222/"
image_docker = [docker_path + os.path.basename(f) for f in locators]
print(image_docker)
"""
output:
['http://host.docker.internal:8222/image4.jpg',
'http://host.docker.internal:8222/image1.jpg',
'http://host.docker.internal:8222/image3.jpg',
'http://host.docker.internal:8222/image0.jpg',
'http://host.docker.internal:8222/image2.jpg']
"""
####################################################
### STEP 5: Add Images to the Index
####################################################
documents = [{"image_docker": image, "_id": str(idx)} for idx, image in enumerate(image_docker)]
print(documents)
res = mq.index(index_name).add_documents(
    documents,
    client_batch_size=1,
    tensor_fields=["image_docker"],
)
pprint(res)
####################################################
### STEP 6: Search using Marqo
####################################################
search_results = mq.index(index_name).search("A rider on a horse jumping over the barrier")
print(search_results)
####################################################
### STEP 7: Visualize the Output
####################################################
from PIL import Image
from IPython.display import display
fig_path = search_results["hits"][0]["image_docker"].replace(docker_path, local_dir)
display(Image.open(fig_path))