This is the official repository of Spotting tell-tale visual artifacts in face swapping videos: strengths and pitfalls of CNN detectors, presented at IWBF2025 and available on IEEE Xplore.
The trained models are available in the following OneDrive folder.
We make our novel FOWS dataset available for research purposes only. You can request access by filling out this Google form.
- install miniconda
- create the 'fows' environment with python 3.10
```bash
conda create -n fows python=3.10
conda activate fows
```
- clone the repository and install the requirements
```bash
# clone project (NOTE: update link)
git clone https://github.com/RickyZi/FOWS_test.git

# install project
cd FOWS_test

# activate the conda env
conda activate fows

# install the requirements
pip install -r requirements.txt
```
A simple demo explaining the whole pipeline of the project is available in Colab. You can use this demo to test the pre-trained models on the FOWS dataset or on your own videos.
Our FOWS dataset consists of a collection of original and manipulated videos of users performing actions that occlude portions of their face. To train the models, we extracted the users' faces from the videos and organized them into 'occluded' and 'non-occluded' sets. For ease of reproduction, we also made the preprocessed version of the FOWS dataset available. You can access it by filling out this Google form.
You can replicate this preprocessing by using the scripts available in the ./preprocessing/ folder:
- frames_and_faces_extraction.py applies MediaPipe's BlazeFace detector to detect and extract the faces from the videos (a minimal sketch of this step is shown below),
- fows_dataset_processing.py organizes the extracted face images into 'occluded' and 'non-occluded' faces. Please note that a manual revision of the results may be needed.
In our work we applied the same frame categorization preprocessing to the GOTCHA dataset using the ./preprocessing/gotcha_dataset_preprocessing.py script to organize occluded and non-occluded faces.
The same preprocessing applied to our FOWS dataset can also be used to prepare your own videos for testing with our pre-trained models.
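As a reference for the face-extraction step, here is a minimal sketch based on MediaPipe's face-detection API and OpenCV; the function name, frame-sampling rate, and output naming are illustrative and do not necessarily match frames_and_faces_extraction.py:

```python
import cv2
import mediapipe as mp

def extract_faces(video_path, out_dir, every_n_frames=10):
    """Detect and crop faces from a video with MediaPipe's BlazeFace detector."""
    detector = mp.solutions.face_detection.FaceDetection(
        model_selection=0, min_detection_confidence=0.5
    )
    cap = cv2.VideoCapture(video_path)
    frame_idx = saved = 0
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        if frame_idx % every_n_frames == 0:
            results = detector.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            for det in results.detections or []:
                # relative bounding box -> pixel coordinates
                box = det.location_data.relative_bounding_box
                h, w = frame.shape[:2]
                x, y = max(int(box.xmin * w), 0), max(int(box.ymin * h), 0)
                crop = frame[y:y + int(box.height * h), x:x + int(box.width * w)]
                if crop.size:
                    cv2.imwrite(f"{out_dir}/face_{frame_idx}_{saved}.png", crop)
                    saved += 1
        frame_idx += 1
    cap.release()
```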
The code for training the models presented in the paper is provided in train.py.
You can train a specific model using the following command:
```bash
python train.py --model mnetv2 --train_dataset fows_occ --ft --tags mnetv2_fows_occ_FT
```
- model: defines the model backbone used for training
- MobileNetV2 (mnetv2)
- EfficientNetB4 (effnetb4)
- XceptionNet (xception)
- train_dataset: the dataset used for training (fows_occ, fows_no_occ)
- ft (or tl): the model training strategy (the difference is illustrated in the sketch after this list)
- ft: Fine-Tuning
- tl: Transfer Learning
- tags: defines the name of the folder where the model weights and the training logs will be saved
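To make the --ft / --tl distinction concrete, here is a minimal PyTorch sketch for the MobileNetV2 backbone; it is not the actual code from train.py, and the pre-trained weights and layer choices are assumptions:

```python
import torch.nn as nn
from torchvision import models

def build_mnetv2(strategy="ft", num_classes=2):
    """Build a MobileNetV2 binary classifier for fine-tuning (ft) or transfer learning (tl)."""
    model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
    if strategy == "tl":
        # transfer learning: freeze the pre-trained backbone, train only the new classifier
        for param in model.features.parameters():
            param.requires_grad = False
    # fine-tuning (ft): all layers stay trainable
    model.classifier[1] = nn.Linear(model.last_channel, num_classes)
    return model
```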
The code to perform inference of the trained models on a specific test_dataset is provided in test.py.
You can test a trained model on a specific dataset with the following command:
```bash
python test.py --model mnetv2 --train_dataset fows_occ --test_dataset fows_no_occ --tl --tags MnetV2_fows_occ_TL_vs_fows_no_occ
```
- model: name of the pre-trained model to use
- train_dataset: the dataset used for training the model (fows_occ, fows_no_occ)
- test_dataset: the dataset used for testing the model (fows_occ, fows_no_occ)
- ft (or tl): the model training strategy
- ft: Fine-Tuning
- tl: Transfer Learning
- tags: the name of the folder where the model inference results and logs will be saved
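As an illustration of what such an inference run involves, here is a minimal evaluation sketch; the checkpoint format, input size, normalization, and directory layout (a torchvision ImageFolder) are assumptions and may differ from what test.py actually does:

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def evaluate(model, test_dir, ckpt_path, device="cuda"):
    """Run a trained detector on a test set and report frame-level accuracy."""
    tfm = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    loader = DataLoader(datasets.ImageFolder(test_dir, transform=tfm),
                        batch_size=32, shuffle=False)
    model.load_state_dict(torch.load(ckpt_path, map_location=device))
    model.to(device).eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1).cpu()
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total
```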
We also provide the code for computing GradCam activations for a given dataset in the gradcam.py script.
Example usage:
```bash
python gradcam.py --model mnetv2 --train_dataset fows_occ --test_dataset fows_no_occ --ft --cam_method gradcam++ --num-layers 1 --tags mnetv2_fows_occ_FT_vs_fows_no_occ
```
- model: name of the pre-trained model to use
- train_dataset: dataset used when training the model (used to select the pre-trained model)
- test_dataset: dataset used for computing GradCam activations (i.e. a random subset of the dataset)
- ft (or tl): training strategy
- ft: Fine Tuning
- tl: Transfer Learning
- cam_method: which gradcam method to apply (gradcam, gradcam++, eigencam, scorecam)
- num_layers (1, 2, or 3): how many layers to use for computing the GradCam output. One layer refers to the last convolutional layer of the model; with more than one layer, the GradCam activations are computed as the average of the layers' activations.
- tags: the name of the folder where to save the gradcam activations
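For reference, here is a minimal sketch of how such activations can be computed with the pytorch-grad-cam library; the target layer (MobileNetV2's last conv block), the class index, and the method mapping are assumptions and may not match gradcam.py:

```python
from pytorch_grad_cam import GradCAM, GradCAMPlusPlus
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget
from pytorch_grad_cam.utils.image import show_cam_on_image

def cam_for_image(model, input_tensor, rgb_img, method="gradcam++"):
    """Compute a CAM heatmap overlay for one preprocessed face image.

    input_tensor: (1, 3, H, W) normalized tensor; rgb_img: HxWx3 float image in [0, 1].
    """
    cam_cls = GradCAMPlusPlus if method == "gradcam++" else GradCAM
    # last conv block of MobileNetV2; other backbones expose it differently
    target_layers = [model.features[-1]]
    cam = cam_cls(model=model, target_layers=target_layers)
    # class index 1 assumed to be the manipulated ("fake") class
    grayscale = cam(input_tensor=input_tensor,
                    targets=[ClassifierOutputTarget(1)])[0]
    return show_cam_on_image(rgb_img, grayscale, use_rgb=True)
```

Passing more than one entry in target_layers makes the library average the activations across layers, which corresponds to the num_layers option described above.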
If your research uses our dataset, models, or code, in part or in full, please cite:
@INPROCEEDINGS{11113429,
author={Ziglio, Riccardo and Pasquini, Cecilia and Ranise, Silvio},
booktitle={2025 13th International Workshop on Biometrics and Forensics (IWBF)},
title={Spotting Tell-Tale Visual Artifacts in Face Swapping Videos: Strengths and Pitfalls of CNN Detectors},
year={2025},
volume={},
number={},
pages={01-06},
keywords={Biometrics;Visualization;Forensics;Soft sensors;Conferences;Detectors;Real-time systems;Data models;Faces;Videos;face swapping;face verification;remote video calls;forensic detection},
doi={10.1109/IWBF63717.2025.11113429}
}