🧩 Embeddings Generator is a system-level microservice designed to power the AI Search feature in Supervisely by providing high-performance vector embeddings generation for images using CLIP (Contrastive Language-Image Pre-Training) technology.
Key features:
- Instance-level service: Runs as a system container for the entire Supervisely instance.
- RESTful API: Provides HTTP endpoints for embeddings generation and semantic search.
- CLIP Service integration: High-quality image embeddings using state-of-the-art CLIP models.
- Qdrant integration: Efficient vector database for embedding storage and retrieval.
- Semantic search capabilities: Text-to-image and image-to-image search functionality.
- Diverse selection: Advanced clustering algorithms for selecting diverse image subsets.
- Zero-downtime operation: Runs continuously in the background as a headless service.
The service enables powerful AI-driven image analysis workflows:
- Text-to-Image Search: Find images using natural language descriptions.
- Image-to-Image Search: Discover visually similar images in datasets.
- Hybrid Search: Combine text prompts and reference images for precise results.
- Diverse Selection: Use clustering algorithms to select diverse image subsets.
The application uses a containerized microservice architecture with RESTful API endpoints:
- Containerized Service: Runs as a Docker container at the instance level.
- CLIP Service: Generates high-quality embeddings using CLIP models.
- Qdrant Integration: Efficiently stores and manages vector embeddings.
- RESTful API: Simple HTTP endpoints for easy integration with external systems.
- Background Processing: Headless service with automatic embedding management.
- Multi-project Support: Handles multiple projects concurrently.
- Supervisely instance with admin access.
- Docker environment for container deployment.
- Running CLIP as Service instance (task ID or service endpoint).
- Qdrant vector database instance (URL).
Configure the service using the environment variables in `docker-compose.yml`:

- Qdrant DB: full URL including protocol (http/https) and port (e.g., `https://192.168.1.1:6333`).
- CLIP Service: task ID of a CLIP as Service session or its host including port (e.g., `1234` or `https://192.168.1.1:51000`).
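As a rough illustration of the two accepted formats, the Python sketch below distinguishes a numeric CLIP task ID from a host URL and checks that an address includes a scheme and port. The function and variable names are illustrative assumptions, not the service's actual environment keys; consult the `docker-compose.yml` shipped with your instance for the real variable names.

```python
from urllib.parse import urlparse

# Illustrative sketch only: the names below are assumptions for demonstration,
# not the service's actual environment variable names (see docker-compose.yml).

def check_url(value: str) -> str:
    """Require a full URL with protocol and port, e.g. https://192.168.1.1:6333."""
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or parsed.port is None:
        raise ValueError(f"expected a full URL with protocol and port, got {value!r}")
    return value

def parse_clip_setting(value: str):
    """The CLIP setting is either a task ID (e.g. '1234') or a host URL with port."""
    return int(value) if value.isdigit() else check_url(value)

print(check_url("https://192.168.1.1:6333"))            # Qdrant DB address
print(parse_clip_setting("1234"))                        # CLIP as Service task ID
print(parse_clip_setting("https://192.168.1.1:51000"))   # or CLIP host with port
```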
The service starts automatically on instance startup and provides API endpoints for all projects in the Supervisely instance.
Recommended: Deploy alongside the Embeddings Auto-Updater service to keep embeddings up-to-date automatically.
The service provides three main API endpoints:
- `/embeddings` - Generate or update embeddings for project images.
- `/search` - Semantic search for similar images using text prompts or reference images.
- `/diverse` - Select diverse images using clustering algorithms.
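For orientation, here is a minimal sketch of how an external client might call the first two endpoints over plain HTTP. The base URL, request fields (`project_id`, `prompt`, `limit`), and response shape are assumptions for illustration, not the service's documented contract; a `/diverse` call is sketched in the Diverse Selection section below.

```python
import requests

# Hypothetical in-instance address of the Embeddings Generator service.
BASE_URL = "http://embeddings-generator:8000"
PROJECT_ID = 123  # ID of a project on the Supervisely instance

# Request field names below are illustrative assumptions;
# consult the service's actual API schema for the real contract.

# 1. Generate or update embeddings for all images in a project.
resp = requests.post(f"{BASE_URL}/embeddings", json={"project_id": PROJECT_ID}, timeout=300)
resp.raise_for_status()

# 2. Semantic search: text prompt (and/or reference image IDs) against stored embeddings.
search = requests.post(
    f"{BASE_URL}/search",
    json={"project_id": PROJECT_ID, "prompt": "a dog on a beach", "limit": 20},
    timeout=60,
)
search.raise_for_status()
print(search.json())  # e.g. matching image IDs with similarity scores
```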
For each project where you want to use the AI Search feature, you first need to enable it.
After the AI Search feature is enabled, embeddings are generated automatically for all images in the project; this may take some time depending on the number of images.
Once embeddings are generated, you can use the semantic search and diverse selection features:
Semantic Search
- Use text prompts to find similar images in your project:
- Use reference images to find visually similar images:
(Screenshots: selecting reference images and the corresponding search results.)
When results are returned, you can see a confidence score for each image, indicating how similar it is to the search query. You can adjust the slider to filter results based on confidence.
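If you consume search results through the API instead of the UI, the same confidence-based filtering can be done client-side. The sketch below assumes each hit is a dict with `id` and `score` fields, which is an assumption about the response shape rather than the documented format.

```python
# Hypothetical response items: [{"id": 101, "score": 0.83}, ...] - the field names
# are assumptions; adapt them to the actual /search response shape.
def filter_by_confidence(hits, threshold=0.5):
    """Keep only hits whose similarity score meets the threshold (like the UI slider)."""
    return [hit for hit in hits if hit["score"] >= threshold]

hits = [{"id": 101, "score": 0.83}, {"id": 102, "score": 0.41}]
print(filter_by_confidence(hits, threshold=0.5))  # -> only the 0.83 hit
```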
Diverse Selection
Use clustering algorithms to select diverse images from your project.
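A minimal sketch of requesting a diverse subset through the `/diverse` endpoint from an external client; the service address and the `project_id`/`sample_size` fields are assumptions for illustration, so check the actual API schema before use.

```python
import requests

# Hypothetical service address and request fields - check the real API schema.
resp = requests.post(
    "http://embeddings-generator:8000/diverse",
    json={"project_id": 123, "sample_size": 50},  # assumed field names
    timeout=120,
)
resp.raise_for_status()
print(resp.json())  # e.g. IDs of images spread across embedding-space clusters
```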
For technical support and questions, please join our Supervisely Ecosystem Slack community.