Tracing Bias for Fairer Content-Based Misinformation Detection

Developed with 💙 at Expert.ai Research Lab

License: ISC
Paper: ACM Digital Library

Content

This repository contains the code and resources for the paper "Tracing Bias for Fairer Content-Based Misinformation Detection", accepted at BeyondFacts 2025: 5th International Workshop on Computational Methods for Online Discourse Analysis @ TheWebConf 2025.

The work investigates bias tracing in AI systems for content-based misinformation detection. It introduces a hybrid approach that integrates semantic modeling (using boxology design patterns) with deep learning to identify and mitigate biases originating from input data. Using fine-tuned language models, the work evaluates fairness across demographic axes (e.g., gender, nationality) by applying bias detection and mitigation techniques.
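To make the idea of transforming data along a demographic axis concrete, here is a minimal sketch of a gender-term swap. It is an illustration under stated assumptions, not the perturbation code used in the notebooks: the word list `GENDER_SWAPS` and the function `swap_gender_terms` are hypothetical names introduced here, and the lexicon is deliberately tiny.

```python
import re

# Hypothetical, deliberately small swap list (an assumption for illustration);
# a real study would rely on a curated lexicon of demographic signifiers.
GENDER_SWAPS = {
    "he": "she", "she": "he",
    "man": "woman", "woman": "man",
    "men": "women", "women": "men",
    "father": "mother", "mother": "father",
}

# Match whole words only, case-insensitively, so "the" or "theme" are untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, GENDER_SWAPS)) + r")\b",
    re.IGNORECASE,
)


def swap_gender_terms(text: str) -> str:
    """Replace each listed gendered word with its counterpart.

    A capitalised first letter in the original word is carried over
    to the replacement.
    """
    def _replace(match: "re.Match[str]") -> str:
        word = match.group(0)
        swapped = GENDER_SWAPS[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped

    return _PATTERN.sub(_replace, text)


original = "He said the woman spread the rumour."
print(swap_gender_terms(original))
```

Ambiguous words such as "her" (which can map to "him" or "his") are left out on purpose; handling them would need part-of-speech information, which this sketch does not attempt.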

The findings show that, on average, the evaluated datasets exhibit stark gender and geographical biases, and that models trained on demographically transformed data are fairer. These results highlight the importance of curated, diverse datasets and systematic bias management strategies to achieve fairer content-based misinformation detection.
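One simple way to quantify fairness along a demographic axis is to compare a model's accuracy across groups. The sketch below is an assumption for illustration only: this README does not state which fairness metrics the notebooks compute, the function names `accuracy_by_group` and `accuracy_gap` are hypothetical, and the labels are invented toy values rather than results from the paper.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return a dict mapping each group label to the accuracy on its examples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(per_group):
    """Difference between the best and worst per-group accuracy (0 means parity)."""
    return max(per_group.values()) - min(per_group.values())


# Invented toy labels (1 = misinformation, 0 = reliable), not real results.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
groups = ["female"] * 4 + ["male"] * 4

per_group = accuracy_by_group(y_true, y_pred, groups)
print(per_group)                 # accuracy for each group
print(accuracy_gap(per_group))   # gap between best and worst group
```

A smaller gap indicates that the classifier performs more evenly across groups; comparing the gap before and after training on transformed data is one way to read the kind of improvement described above.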

Repository Structure

├── data/                # Datasets used in the project
├── heterogeneity/       # Code to study data distributions across demographic axes
├── models/              # Trained models (populated after training)
├── notebooks/           # Jupyter notebooks to run experiments
│   ├── analyse/         # Data statistics and fairness metrics
│   ├── evaluate/        # Evaluation along demographic axes
│   ├── process/         # Data processing, including perturbation
│   └── train/           # Train models across demographic axes

Installation and Reproducibility

To run the experiments, first create a Conda environment with:

conda create -n ENV_NAME python=3.9

Next, activate the environment:

conda activate ENV_NAME

Then, install the required dependencies with:

pip install -r requirements.txt

Finally, open and execute the notebooks in the notebooks folder.
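To see which notebooks are available in each subfolder before opening them, a small helper like the one below can be run from the repository root. It is a convenience sketch, not part of the repository: `find_notebooks` is a hypothetical name, and the only assumption about the layout is the `notebooks/` folder shown in the repository structure.

```python
from pathlib import Path


def find_notebooks(root="notebooks"):
    """Group notebook file names by their subfolder, relative to `root`.

    Returns an empty dict when `root` does not exist or holds no notebooks.
    """
    root_path = Path(root)
    found = {}
    for path in sorted(root_path.rglob("*.ipynb")):
        # Skip Jupyter's autosave copies.
        if ".ipynb_checkpoints" in path.parts:
            continue
        stage = path.parent.relative_to(root_path).as_posix()
        found.setdefault(stage, []).append(path.name)
    return found


# Print one block per subfolder (e.g. analyse, evaluate, process, train).
for stage, names in find_notebooks("notebooks").items():
    print(stage)
    for name in names:
        print("  " + name)
```

The helper only lists files; it does not imply an execution order, which should be taken from the notebooks themselves.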

How to cite

To cite this research, please use the following BibTeX entry:

@inproceedings{10.1145/3701716.3717534,
    author = {Russo, Mayra and Merenda, Flavio and Gomez-Perez, Jose Manuel and Vidal, Maria-Esther},
    title = {Tracing Bias for Fairer Content-Based Misinformation Detection},
    year = {2025},
    isbn = {9798400713316},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3701716.3717534},
    doi = {10.1145/3701716.3717534},
    abstract = {Despite the benefits attributed to AI systems, their deployment across domains still present challenges to society. In the case of automated misinformation detection, research has uncovered that benefits derived from their application are unequally distributed amongst different stakeholders, calling to attention the need to audit these AI systems for biases and other sources of harm. We present a hybrid AI system designed to trace biases from input data, enriched with semantic descriptions. Using boxology design patterns, we illustrate the integration of a semantic model with an AI system to enable bias tracing. In our case study, we assess fine-tuned language models for content-based misinformation detection, and adapt existing bias detection and mitigation techniques to transform data based on demographic signifiers and measure model fairness. Our findings show evidence that, on average, the evaluated datasets demonstrate a stark gender and geographical biases. Further, we observe that models trained on demographically transformed data demonstrate higher fairness. These results underscore the importance of curated and diverse data and of managing biases plaguing language models at task level.},
    booktitle = {Companion Proceedings of the ACM on Web Conference 2025},
    pages = {2670–2679},
    numpages = {10},
    keywords = {automated misinformation detection, bias, model robustness, nlp},
    location = {Sydney NSW, Australia},
    series = {WWW '25}
}

Expert.ai

At Expert.ai we turn language into data so humans can make better decisions. Take a look here!
