A Retrieval-Augmented Generation (RAG) pipeline built with FastAPI that processes queries about France, retrieves relevant text chunks, and generates comprehensive responses using LLMs.
    .
    ├── src/
    │   ├── data_utils.py        # Data processing, web scraping, and text chunking
    │   ├── retrieval.py         # Vector embeddings and retrieval logic
    │   ├── llm_service.py       # LLM integration (Together AI/OpenAI)
    │   ├── main.py              # FastAPI application
    │   ├── start_services.py    # Service launcher with health checks
    │   └── fix_embeddings.py    # Embeddings integrity checker
    ├── data/
    │   ├── embeddings.pkl       # Precomputed vector embeddings
    │   └── processed_data.json  # Processed and chunked data
    ├── static/
    │   ├── index.html           # Simple web UI
    │   ├── compare.html         # Retrieval method comparison UI
    │   └── benchmark.html       # Benchmarking UI
    ├── requirements.txt         # Project dependencies
    └── Doc-Final Project.pdf    # Project documentation
- Web Scraping: Extracts information from encyclopedic sources about France
- Text Processing: Cleans and chunks text data with semantic awareness
- Vector Search: Uses embeddings for semantic similarity search
- Hybrid Retrieval: Combines vector similarity with metadata filtering (see the sketch after this list)
- LLM Integration: Works with Together AI or OpenAI for response generation
- Web UI: Simple interface for querying the system and viewing results
- Benchmarking: Advanced system for evaluating RAG pipeline performance against key metrics
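As a rough illustration of the hybrid retrieval idea above, the sketch below combines cosine similarity over precomputed embeddings with a simple metadata filter. It is a minimal sketch: the chunk schema (the `embedding`, `text`, and `metadata` fields and the `topic` filter) and the function name are assumptions for illustration, not the actual retrieval.py API.

```python
import numpy as np

def hybrid_retrieve(query_vec, chunks, top_k=5, topic=None):
    """Rank chunks by cosine similarity, optionally filtering on metadata first.

    `chunks` is assumed to be a list of dicts with 'embedding' (np.ndarray),
    'text', and 'metadata' keys -- an illustrative schema, not the real one.
    """
    # Metadata filter: keep only chunks tagged with the requested topic.
    candidates = [c for c in chunks
                  if topic is None or c["metadata"].get("topic") == topic]

    def cosine(a, b):
        # Cosine similarity between two 1-D vectors.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Vector similarity: score the surviving candidates against the query.
    scored = sorted(candidates,
                    key=lambda c: cosine(query_vec, c["embedding"]),
                    reverse=True)
    return scored[:top_k]
```

Filtering before scoring keeps the similarity search cheap whenever metadata can rule out most chunks up front.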
1. Install dependencies:

       pip install -r requirements.txt
2. Set environment variables (provider selection is sketched after these steps):

       # For Together AI (preferred)
       TOGETHER_API_KEY=your_api_key_here
       # Or for OpenAI
       OPENAI_API_KEY=your_api_key_here
3. Extract text chunks and create embeddings before starting the service (an embedding sketch also follows these steps):

       python src/data_utils.py   # Extract and chunk data
       python src/retrieval.py    # Generate embeddings for chunks
4. Start the service:

       python start_services.py
5. Access the UI: open your browser and visit http://localhost:8000
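For the environment variables in step 2, a startup check along these lines can enforce the Together-AI-first precedence. This is a hedged sketch of the idea, not the actual llm_service.py logic.

```python
import os

# Prefer Together AI when its key is present, otherwise fall back to OpenAI.
# This mirrors the precedence stated in step 2; the real selection logic
# in llm_service.py may differ.
if os.getenv("TOGETHER_API_KEY"):
    provider = "together"
elif os.getenv("OPENAI_API_KEY"):
    provider = "openai"
else:
    raise RuntimeError("Set TOGETHER_API_KEY or OPENAI_API_KEY before starting.")

print(f"Using LLM provider: {provider}")
```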
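Step 3's embedding pass might look roughly like the sketch below, assuming a sentence-transformers model. The model name and the 'text' field inside processed_data.json are assumptions for illustration; only the file paths come from the repository layout.

```python
import json
import pickle
from sentence_transformers import SentenceTransformer  # assumed embedding library

# Load the chunked data produced by data_utils.py.
with open("data/processed_data.json", encoding="utf-8") as f:
    chunks = json.load(f)

# Embed every chunk's text with a small general-purpose model
# (model choice is an assumption, not the project's actual one).
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode([c["text"] for c in chunks], show_progress_bar=True)

# Persist chunks and vectors together for the retrieval service to load.
with open("data/embeddings.pkl", "wb") as f:
    pickle.dump({"chunks": chunks, "embeddings": embeddings}, f)
```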
- GET / - Web UI
- GET /health - Health check endpoint
- POST /retrieve - Retrieve relevant chunks for a query
- POST /generate - Generate a response using retrieved context
- GET /stats - Get embedding statistics
- GET /llm-status - Get LLM service status
- POST /compare-retrieval - Compare vector vs hybrid retrieval methods
- POST /benchmark - Run comprehensive benchmarks against the RAG pipeline
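As a usage example for the two core endpoints above, the snippet below posts a question to /retrieve and /generate with Python's requests library. The JSON field names ('query', 'top_k', and the response shapes) are assumptions; FastAPI's interactive docs at /docs show the actual request schemas.

```python
import requests

BASE = "http://localhost:8000"
question = "What is the capital of France?"

# Retrieve the top chunks for the question (field names are assumed).
chunks = requests.post(f"{BASE}/retrieve",
                       json={"query": question, "top_k": 3})
print(chunks.json())

# Generate a full answer grounded in the retrieved context.
answer = requests.post(f"{BASE}/generate",
                       json={"query": question})
print(answer.json())
```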
[Screenshots of the user interface and benchmarking tools]
Course: Data Mining
University: University of Isfahan
Professor: Dr. Mohammad Kiani
Semester: Spring 2025
This project is licensed under the MIT License.
Contributions are welcome! Please feel free to submit a Pull Request.