FastAPI RAG Pipeline

A Retrieval-Augmented Generation (RAG) pipeline built with FastAPI that processes queries about France, retrieves relevant text chunks, and generates comprehensive responses using LLMs.

Project Structure

.
├── src/
│   ├── data_utils.py         # Data processing, web scraping, and text chunking
│   ├── retrieval.py          # Vector embeddings and retrieval logic
│   ├── llm_service.py        # LLM integration (Together AI/OpenAI)
│   ├── main.py               # FastAPI application
│   ├── start_services.py     # Service launcher with health checks
│   └── fix_embeddings.py     # Embeddings integrity checker
├── data/
│   ├── embeddings.pkl        # Precomputed vector embeddings
│   └── processed_data.json   # Processed and chunked data
├── static/
│   ├── index.html            # Simple web UI
│   ├── compare.html          # Retrieval method comparison UI
│   └── benchmark.html        # Benchmarking UI
├── requirements.txt          # Project dependencies
└── Doc-Final Project.pdf     # Project documentation
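The text chunking handled by `src/data_utils.py` can be sketched roughly as follows. This is a simplified illustration under assumed names (`chunk_text`, `max_chars`), not the project's actual implementation:

```python
import re

def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    """Naive sentence-aware chunker: split on sentence boundaries,
    then pack consecutive sentences into chunks up to max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would exceed the budget
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Packing whole sentences (rather than cutting at a fixed character offset) keeps each chunk semantically coherent, which tends to improve retrieval quality.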

Features

  • Web Scraping: Extracts information from encyclopedic sources about France
  • Text Processing: Cleans and chunks text data with semantic awareness
  • Vector Search: Uses embeddings for semantic similarity search
  • Hybrid Retrieval: Combines vector similarity with metadata filtering
  • LLM Integration: Works with Together AI or OpenAI for response generation
  • Web UI: Simple interface for querying the system and viewing results
  • Benchmarking: Advanced system for evaluating RAG pipeline performance against key metrics
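The hybrid retrieval feature above — vector similarity combined with metadata filtering — can be sketched as: filter candidate chunks by metadata first, then rank the survivors by cosine similarity. The data shapes and function names here are assumptions for illustration, not the code in `src/retrieval.py`:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_retrieve(query_vec, chunks, metadata_filter=None, top_k=3):
    """chunks: list of dicts with 'text', 'embedding', and 'meta' keys.
    metadata_filter: optional predicate applied to each chunk's metadata."""
    candidates = [
        c for c in chunks
        if metadata_filter is None or metadata_filter(c["meta"])
    ]
    candidates.sort(key=lambda c: cosine(query_vec, c["embedding"]), reverse=True)
    return candidates[:top_k]

# Usage sketch with toy 2-D embeddings:
docs = [
    {"text": "Paris is the capital of France.",
     "embedding": [1.0, 0.0], "meta": {"topic": "geography"}},
    {"text": "The French Revolution began in 1789.",
     "embedding": [0.0, 1.0], "meta": {"topic": "history"}},
]
top = hybrid_retrieve([1.0, 0.0], docs,
                      metadata_filter=lambda m: m["topic"] == "geography",
                      top_k=1)
```

Applying the metadata filter before ranking narrows the search space, so the similarity computation only runs over chunks that can actually satisfy the query's constraints.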

Quick Start

  1. Install dependencies:

    pip install -r requirements.txt
    
  2. Set environment variables:

    # For Together AI (preferred)
    export TOGETHER_API_KEY=your_api_key_here
    
    # Or for OpenAI
    export OPENAI_API_KEY=your_api_key_here
    
  3. Extract text chunks and create embeddings (run these scripts before starting the service):

    python src/data_utils.py    # Extract and chunk data
    python src/retrieval.py     # Generate embeddings for chunks
    
  4. Start the service:

    python src/start_services.py
    
  5. Access the UI: Open your browser and visit http://localhost:8000

API Endpoints

  • GET / - Web UI interface
  • GET /health - Health check endpoint
  • POST /retrieve - Retrieve relevant chunks for a query
  • POST /generate - Generate a response using retrieved context
  • GET /stats - Get embedding statistics
  • GET /llm-status - Get LLM service status
  • POST /compare-retrieval - Compare vector vs hybrid retrieval methods
  • POST /benchmark - Run comprehensive benchmarks against the RAG pipeline
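The `/retrieve` and `/generate` endpoints can be called from any HTTP client once the service is running. The sketch below uses only the standard library; the request field names (`query`, `top_k`) are assumptions — check the request models in `src/main.py` for the actual schema:

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"

def build_payload(query: str, top_k: int = 3) -> dict:
    # Assumed field names; verify against the FastAPI models in src/main.py.
    return {"query": query, "top_k": top_k}

def post(endpoint: str, payload: dict) -> dict:
    """POST a JSON payload to the running service and decode the JSON reply."""
    req = request.Request(
        f"{BASE_URL}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    payload = build_payload("What is the capital of France?")
    print(post("/retrieve", payload))   # top-k matching chunks
    print(post("/generate", payload))   # LLM answer grounded in those chunks
```

FastAPI also serves interactive API docs at `http://localhost:8000/docs`, which is the quickest way to inspect the exact request and response schemas.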

User Interface

Below are screenshots of the user interface and benchmarking tools:

Main Query Interface (screenshot)

Retrieval Method Comparison (screenshot)

Benchmarking Dashboard (screenshot)

Course Information

Course: Data Mining
University: University of Isfahan
Professor: Dr. Mohammad Kiani
Semester: Spring 2025

License

This project is licensed under the MIT License.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
