Fine-tuned RoBERTa-based multi-modal fake news detector with explanation generation via FLAN-T5 and support for URL, PDF, and raw-text input. The system is orchestrated through a LangGraph-powered agentic pipeline with Planner, Retriever, Tool Router, Fallback, and LLM Answerer agents, plus memory and dynamic tool augmentation.
demo.mp4
Try it now: InformaTruth – Fake News Detection AI App
In the digital age, misinformation spreads rapidly across news outlets, social media, and online platforms, and distinguishing credible journalism from deceptive content is increasingly difficult. InformaTruth is an agentic AI system that detects fake news from raw text, PDF documents, or website URLs using a fine-tuned RoBERTa model. It is built on a multi-agent LangGraph architecture with Planner, Retriever, Tool Router, and Explanation agents. Once a claim is classified, FLAN-T5 generates human-readable reasoning for the verdict; if local evidence is insufficient, the system falls back on Wikipedia or DuckDuckGo search. The result is a production-grade solution that supports real-world fact-checking, multi-source ingestion, tool-augmented reasoning, and modular orchestration.
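At its core this is a two-step predict-then-explain loop: RoBERTa classifies the claim, FLAN-T5 justifies the verdict. Below is a minimal sketch of that idea, assuming the fine-tuned checkpoint lives in the `fine_tuned_liar_detector/` directory and that label index 0 means fake; the actual label mapping, prompting, and model loading live in `models/` and `agents/`.

```python
# Sketch only: classify a claim with the fine-tuned RoBERTa model, then ask
# FLAN-T5 for a zero-shot rationale. Paths and the label order are assumptions.
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    AutoModelForSeq2SeqLM,
)

# Fine-tuned classifier (directory name taken from the project structure)
clf_tok = AutoTokenizer.from_pretrained("fine_tuned_liar_detector")
clf = AutoModelForSequenceClassification.from_pretrained("fine_tuned_liar_detector")

# Explanation model
exp_tok = AutoTokenizer.from_pretrained("google/flan-t5-base")
exp = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

def detect_and_explain(claim: str) -> dict:
    # 1) Classify the claim
    inputs = clf_tok(claim, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = clf(**inputs).logits
    label = "FAKE" if logits.argmax(dim=-1).item() == 0 else "REAL"  # assumed label order

    # 2) Generate a human-readable, zero-shot justification
    prompt = f"The claim below was classified as {label}. Explain why.\n\nClaim: {claim}"
    ids = exp_tok(prompt, return_tensors="pt").input_ids
    explanation = exp_tok.decode(
        exp.generate(ids, max_new_tokens=128)[0], skip_special_tokens=True
    )
    return {"label": label, "explanation": explanation}
```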
Category | Technology/Resource |
---|---|
Core Framework | PyTorch, Transformers, HuggingFace |
Classification Model | Fine-tuned RoBERTa-base on LIAR Dataset |
Explanation Model | FLAN-T5-base (Zero-shot Prompting) |
Training Data | LIAR Dataset (Political Fact-Checking) |
Evaluation Metrics | Accuracy, Precision, Recall, F1-score |
Training Framework | HuggingFace Trainer |
LangGraph Orchestration | LangGraph (Multi-Agent Directed Acyclic Execution Graph) |
Agents Used | PlannerAgent, InputHandlerAgent, ToolRouterAgent, ExecutorAgent, ExplanationAgent, FallbackSearchAgent |
Input Modalities | Raw Text, Website URLs (via Newspaper3k), PDF Documents (via PyMuPDF) |
Tool Augmentation | DuckDuckGo Search API (Fallback), Wikipedia (Planned), ToolRouter Logic |
Web Scraping | Newspaper3k (HTML → clean article) |
PDF Parsing | PyMuPDF |
Explainability | Natural language justification generated using FLAN-T5 |
State Management | Shared State Object (LangGraph-compatible) |
Deployment Interface | Flask (HTML, CSS, JS) |
Hosting Platform | Render (Docker) |
Version Control | Git, GitHub |
Logging & Debugging | Logs, Print Debugs, Custom Logger |
Input Support | Text, URLs, PDF documents |
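The input modalities above rely on Newspaper3k for URL scraping and PyMuPDF for PDF parsing. The following is a minimal ingestion sketch under those assumptions; `extract_text` is an illustrative helper, not the project's actual API.

```python
# Sketch of multi-format input handling: raw text passes through, URLs go
# through Newspaper3k, PDFs through PyMuPDF.
import fitz  # PyMuPDF
from newspaper import Article

def extract_text(source: str, kind: str = "text") -> str:
    if kind == "url":
        article = Article(source)
        article.download()   # fetch raw HTML
        article.parse()      # strip boilerplate, keep the article body
        return article.text
    if kind == "pdf":
        with fitz.open(source) as doc:
            return "\n".join(page.get_text() for page in doc)
    return source            # already plain text

# e.g. extract_text("https://example.com/news-story", kind="url")
```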
- **Multi-Format Input Support:** Accepts raw text, web URLs, and PDF documents, with automated preprocessing for each type.
- **Full NLP Pipeline:** Integrates optional summarization, fake news classification (RoBERTa), and natural language explanation (FLAN-T5).
- **Modular Agent-Based Architecture:** Built with LangGraph using modular agents: `Planner`, `Tool Router`, `Executor`, `Explanation Agent`, and `Fallback Agent`.
- **Explanation Generation:** Uses FLAN-T5 to generate human-readable, zero-shot rationales for model predictions.
- **Tool-Augmented & Fallback Logic:** Dynamically queries DuckDuckGo when local context is insufficient, enabling robust fallback handling (a minimal sketch follows this list).
- **Clean, Modular Codebase with Logging:** Structured around clean-architecture principles, agent separation, and informative logging.
- **Flask Web UI:** User-friendly, interactive, and responsive frontend for input, output, and visual explanations.
- **Dockerized for Deployment:** Fully containerized setup with a `Dockerfile` and `requirements.txt` for seamless deployment.
- **CI/CD with GitHub Actions:** Automated pipelines for testing, linting, and Docker build validation to ensure code quality and production readiness.
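The fallback behaviour can be approximated with the `duckduckgo_search` package, as in the sketch below; the real `FallbackSearchAgent` in `agents/fallback_search.py` may wrap the search differently.

```python
# Sketch only: fetch web snippets to use as extra context when local
# evidence is insufficient. Assumes the duckduckgo_search package.
from duckduckgo_search import DDGS

def fallback_search(claim: str, max_results: int = 5) -> str:
    with DDGS() as ddgs:
        hits = ddgs.text(claim, max_results=max_results)
    # Concatenate titles and snippets into a single context string
    return "\n".join(f"- {hit['title']}: {hit['body']}" for hit in hits)
```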
InformaTruth/
│
├── .github/                      # GitHub Actions
│   └── workflows/
│       └── main.yml
│
├── agents/                       # Modular agents (planner, executor, etc.)
│   ├── executor.py
│   ├── fallback_search.py
│   ├── input_handler.py
│   ├── planner.py
│   ├── router.py
│   └── __init__.py
│
├── fine_tuned_liar_detector/     # Fine-tuned RoBERTa model directory
│   ├── config.json
│   ├── vocab.json
│   ├── tokenizer_config.json
│   ├── special_tokens_map.json
│   ├── model.safetensors
│   └── merges.txt
│
├── graph/                        # LangGraph state and builder logic
│   ├── builder.py
│   ├── state.py
│   └── __init__.py
│
├── models/                       # Classification + LLM model loader
│   ├── classifier.py
│   ├── loader.py
│   └── __init__.py
│
├── news/                         # Sample news or test input
│   └── news.pdf
│
├── notebook/                     # Jupyter notebooks for experimentation
│   ├── 1 Fine-Tuning.ipynb
│   └── 2 Fine-Tuning with Multi Agent.ipynb
│
├── static/                       # Static files (CSS, JS)
│   ├── css/
│   │   └── style.css
│   └── js/
│       └── script.js
│
├── templates/                    # HTML templates for Flask UI
│   ├── dj_base.html
│   └── dj_index.html
│
├── tests/                        # Unit tests
│   └── test_app.py
│
├── train/                        # Training logic
│   ├── config.py
│   ├── data_loader.py
│   ├── predictor.py
│   ├── run.py
│   ├── trainer.py
│   └── __init__.py
│
├── utils/                        # Utilities like logging, evaluation
│   ├── logger.py
│   ├── results.py
│   └── __init__.py
│
├── __init__.py
├── app.png                       # Demo screenshot
├── demo.webm                     # Demo video
├── app.py                        # Flask app entry point
├── main.py                       # Main script / orchestrator
├── config.py                     # Configuration file
├── setup.py                      # Project setup for pip install
├── render.yaml                   # Render deployment config
├── Dockerfile                    # Docker container spec
├── requirements.txt              # Python dependencies
├── LICENSE                       # License file
├── .gitignore                    # Git ignore rules
├── .gitattributes                # Git LFS rules
└── README.md                     # Readme
graph TD
A[User Input] --> B{Input Type}
B -->|Text| C[Direct Text Processing]
B -->|URL| D[Newspaper3k Parser]
B -->|PDF| E[PyMuPDF Parser]
C --> F[Text Cleaner]
D --> F
E --> F
F --> G[Context Validator]
G -->|Sufficient Context| H[RoBERTa Classifier]
G -->|Insufficient Context| I[Web Search Agent]
I --> J[Context Aggregator]
J --> H
H --> K[FLAN-T5 Explanation Generator]
K --> L[Output Formatter]
L --> M[Web UI using Flask, HTML, CSS, JS]
style M fill:#e3f2fd,stroke:#90caf9
style G fill:#fff9c4,stroke:#fbc02d
style I fill:#fbe9e7,stroke:#ff8a65
style H fill:#f1f8e9,stroke:#aed581
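The flow above maps naturally onto a LangGraph `StateGraph`. The sketch below uses placeholder node functions and an assumed state schema; the real wiring lives in `graph/builder.py` and `graph/state.py`, and the real agents in `agents/`.

```python
# Sketch only: wiring the pipeline as a LangGraph StateGraph with a shared,
# LangGraph-compatible state object. Node bodies are placeholders.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class PipelineState(TypedDict, total=False):
    raw_input: str      # text, URL, or PDF path
    text: str           # cleaned article text
    context_ok: bool    # result of the context validator
    label: str          # REAL / FAKE
    explanation: str    # FLAN-T5 rationale

def input_handler(state):   # Newspaper3k / PyMuPDF extraction in the real agent
    return {"text": state["raw_input"]}

def validator(state):       # the real agent checks length / relevance
    return {"context_ok": len(state["text"].split()) > 20}

def web_search(state):      # DuckDuckGo fallback in the real agent
    return {"text": state["text"] + "\n[web snippets appended here]"}

def classifier(state):      # RoBERTa prediction in the real agent
    return {"label": "REAL"}

def explainer(state):       # FLAN-T5 rationale in the real agent
    return {"explanation": f"Classified as {state['label']} (placeholder)."}

builder = StateGraph(PipelineState)
for name, fn in [("input", input_handler), ("validate", validator),
                 ("search", web_search), ("classify", classifier),
                 ("explain", explainer)]:
    builder.add_node(name, fn)

builder.set_entry_point("input")
builder.add_edge("input", "validate")
# Route straight to the classifier, or via the web-search fallback first
builder.add_conditional_edges(
    "validate",
    lambda s: "classify" if s.get("context_ok") else "search",
    {"classify": "classify", "search": "search"},
)
builder.add_edge("search", "classify")
builder.add_edge("classify", "explain")
builder.add_edge("explain", END)

app = builder.compile()
# result = app.invoke({"raw_input": "Some claim, URL, or extracted PDF text"})
```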
Epoch | Train Loss | Val Loss | Accuracy | F1 | Precision | Recall |
---|---|---|---|---|---|---|
1 | 0.3889 | 0.6674 | 0.7204 | 0.8285 | 0.7461 | 0.9313 |
2 | 0.4523 | 0.6771 | 0.7196 | 0.8259 | 0.7511 | 0.9173 |
The emphasis on recall ensures the model catches most fake news cases, i.e. it produces few false negatives.
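These numbers come from the HuggingFace Trainer evaluation loop. A minimal `compute_metrics` sketch that produces the same four columns is shown below; the actual training configuration lives in `train/`.

```python
# Sketch only: metric hook for the HuggingFace Trainer producing
# accuracy, F1, precision, and recall for the binary LIAR task.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="binary"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }

# trainer = Trainer(model=model, args=args, ..., compute_metrics=compute_metrics)
```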
docker build -t informa-truth-app .
docker run -p 8501:8501 informa-truth-app
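The container serves the Flask app on port 8501, matching the `docker run` mapping above. A minimal sketch of what the `app.py` entry point might look like under that assumption; the real app renders the templates in `templates/` and calls into the LangGraph pipeline.

```python
# Illustrative Flask entry point; route names and the pipeline call are assumptions.
from flask import Flask, render_template, request

app = Flask(__name__)

@app.route("/", methods=["GET", "POST"])
def index():
    result = None
    if request.method == "POST":
        claim = request.form.get("claim", "")
        # result = run_pipeline(claim)  # hypothetical call into the LangGraph graph
        result = {"label": "REAL", "explanation": "placeholder"}
    return render_template("dj_index.html", result=result)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8501)  # port matches the docker run mapping above
```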
The CI/CD pipeline automates code checks, automated tests, and Docker image build validation.
name: CI Pipeline
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v3
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install flake8 pytest
      - name: Run tests
        run: pytest tests/
      - name: Docker build
        run: docker build -t informa-truth-app .
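The `pytest tests/` step runs checks such as `tests/test_app.py`. A minimal example of the kind of test it might contain, assuming the Flask application object in `app.py` is named `app`:

```python
# Sketch only: smoke test that the Flask UI responds on the index route.
from app import app

def test_index_page_loads():
    client = app.test_client()
    response = client.get("/")
    assert response.status_code == 200
```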
- Journalists and media watchdogs
- Educators and students
- Concerned citizens and digital media consumers
- Social media platforms for content moderation
Md Emon Hasan
- Email: iconicemon01@gmail.com
- GitHub
- LinkedIn
- Facebook
- WhatsApp