
Spotting Tell-Tale Visual Artifacts in Face Swapping Videos: Strengths and Pitfalls of CNN Detectors


This is the official repository of Spotting Tell-Tale Visual Artifacts in Face Swapping Videos: Strengths and Pitfalls of CNN Detectors, presented at IWBF 2025 and available on IEEE Xplore.

The trained models are available in the following OneDrive folder.

Our novel FOWS dataset is made available for research purposes only. You can request access to the dataset by filling out this Google form.

Setup

Install requirements

  • Install Miniconda
  • Create the 'fows' environment with Python 3.10:
    conda create -n fows python=3.10
    conda activate fows
  • Clone the repository and install the requirements:
    # clone the project (NOTE: update link)
    git clone https://github.com/RickyZi/FOWS_test.git

    # enter the project folder
    cd FOWS_test
    # activate the conda env
    conda activate fows
    # install the requirements
    pip install -r requirements.txt

Usage

Quick run

A simple demo illustrating the whole pipeline of the project is available on Colab. You can use this demo to test the pre-trained models on the FOWS dataset or on your own videos.

FOWS demo notebook

Dataset preprocessing

Our FOWS dataset consists of a collection of original and manipulated videos of users performing actions that occlude portions of their faces. To train the models, we extracted the users' faces from the videos and organized them into 'occluded' and 'non-occluded' sets. For ease of reproduction, we also make available a preprocessed version of the FOWS dataset; you can access it by filling out this Google form.

You can replicate this preprocessing by using the scripts available in the ./preprocessing/ folder:

  • frames_and_faces_extraction.py applies MediaPipe's BlazeFace detector to detect and extract the faces from each video,
  • fows_dataset_processing.py organizes the extracted images into 'occluded' and 'non-occluded' faces. Please note that a manual revision of the results may be needed in this case.

In our work, we applied the same frame categorization preprocessing to the GOTCHA dataset, using the ./preprocessing/gotcha_dataset_preprocessing.py script to organize occluded and non-occluded faces.

The same preprocessing applied to our FOWS dataset can also be used to prepare your own videos for testing with our pre-trained models; a minimal sketch of the face extraction step is shown below.
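
For reference, the core of the face extraction step can be sketched with OpenCV and MediaPipe's BlazeFace detector as follows. This is a simplified illustration, not the exact code of frames_and_faces_extraction.py; the input/output paths, frame handling, and detection threshold are assumptions.

    # Minimal sketch of the face extraction step (OpenCV + MediaPipe BlazeFace).
    # Paths and the detection threshold are illustrative assumptions, not the
    # exact parameters used in frames_and_faces_extraction.py.
    import os
    import cv2
    import mediapipe as mp

    VIDEO_PATH = "my_video.mp4"   # hypothetical input video
    OUT_DIR = "extracted_faces"   # hypothetical output folder
    os.makedirs(OUT_DIR, exist_ok=True)

    cap = cv2.VideoCapture(VIDEO_PATH)
    frame_idx = 0
    with mp.solutions.face_detection.FaceDetection(
            model_selection=0,              # short-range BlazeFace model
            min_detection_confidence=0.5) as detector:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input
            results = detector.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.detections:
                h, w = frame.shape[:2]
                box = results.detections[0].location_data.relative_bounding_box
                x1 = max(int(box.xmin * w), 0)
                y1 = max(int(box.ymin * h), 0)
                x2 = min(int((box.xmin + box.width) * w), w)
                y2 = min(int((box.ymin + box.height) * h), h)
                cv2.imwrite(os.path.join(OUT_DIR, f"frame_{frame_idx:05d}.png"),
                            frame[y1:y2, x1:x2])
            frame_idx += 1
    cap.release()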

Model training

The code for training the models presented in the paper is provided in train.py.

You can train a specific model using the following command:

   python train.py --model mnetv2 --train_dataset fows_occ --ft --tags mnetv2_fows_occ_FT
  • model: defines the model backbone used for training
    • MobileNetV2 (mnetv2)
    • EfficientNetB4 (effnetb4)
    • XceptionNet (xception)
  • train_dataset: the dataset used for training (fows_occ, fows_no_occ)
  • ft (or tl): the model training strategy (a minimal sketch of the two strategies follows this list)
    • ft: Fine-Tuning
    • tl: Transfer Learning
  • tags: defines the name of the folder where the model weights and the training logs will be saved
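
To make the two strategies concrete, here is a minimal PyTorch sketch of how a MobileNetV2 backbone can be set up for transfer learning (frozen backbone, only a new classification head is trained) versus fine-tuning (all weights trainable). It is an illustrative sketch, not the exact model-building code of train.py; the binary head and weight initialization are assumptions.

    # Minimal sketch of the ft/tl strategies for a MobileNetV2 backbone (torchvision).
    # Illustrative only; train.py is the reference implementation.
    import torch.nn as nn
    from torchvision import models

    def build_mnetv2(strategy: str, num_classes: int = 2) -> nn.Module:
        model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
        if strategy == "tl":
            # Transfer Learning: freeze the convolutional backbone so that
            # only the new classification head is updated during training.
            for param in model.features.parameters():
                param.requires_grad = False
        # Fine-Tuning ("ft"): all weights stay trainable.
        # In both cases, replace the ImageNet head with a binary (real/fake) head.
        model.classifier[1] = nn.Linear(model.last_channel, num_classes)
        return model

    model = build_mnetv2("tl")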

Inference

The code to perform inference with the trained models on a specific test dataset is provided in test.py.

You can test a trained model on a specific dataset with the following command:

python test.py --model mnetv2 --train_dataset fows_occ --test_dataset fows_no_occ --tl --tags MnetV2_fows_occ_TL_vs_fows_no_occ
  • model: name of the pre-trained model to use
  • train_dataset: the dataset used for training the model (fows_occ, fows_no_occ)
  • test_dataset: the dataset used for testing the model (fows_occ, fows_no_occ)
  • ft (or tl): the model training strategy
    • ft: Fine-Tuning
    • tl: Transfer Learning
  • tags: the name of the folder where the model inference results and logs will be saved
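
For orientation, running a trained checkpoint on one of the preprocessed face folders amounts to a standard PyTorch evaluation loop, sketched below. The checkpoint path, folder layout, input size, and normalization are illustrative assumptions; test.py is the reference implementation and handles these details for you.

    # Minimal evaluation sketch: load a trained checkpoint and measure accuracy
    # on a folder of face crops (one subfolder per class, e.g. real/fake).
    # Paths, input size, normalization and checkpoint format are assumptions.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, models, transforms

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    transform = transforms.Compose([
        transforms.Resize((224, 224)),                       # assumed input size
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406],          # ImageNet stats (assumed)
                             [0.229, 0.224, 0.225]),
    ])
    test_set = datasets.ImageFolder("path/to/test_faces", transform=transform)
    loader = DataLoader(test_set, batch_size=32, shuffle=False)

    model = models.mobilenet_v2()
    model.classifier[1] = nn.Linear(model.last_channel, 2)   # binary real/fake head
    model.load_state_dict(torch.load("path/to/checkpoint.pth", map_location=device))
    model.to(device).eval()

    correct = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds == labels.to(device)).sum().item()
    print(f"Accuracy: {correct / len(test_set):.3f}")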

We also provide the code for computing GradCam activations for a given dataset in the gradcam.py script.

Example usage:

    python gradcam.py --model mnetv2 --train_dataset fows_occ --test_dataset fows_no_occ --ft --cam_method gradcam++ --num-layers 1 --tags mnetv2_fows_occ_FT_vs_fows_no_occ
  • model: name of the pre-trained model to use
  • train_dataset: dataset used when training the model (used to select the pre-trained model)
  • test_dataset: dataset used for computing GradCam activations (i.e. a random subset of the dataset)
  • ft (or tl): training strategy
    • ft: Fine Tuning
    • tl: Transfer Learning
  • cam_method: which gradcam method to apply (gradcam, gradcam++, eigencam, scorecam)
  • num_layers (1, 2, or 3): how many layers to use for computing the GradCam output. One layer refers to the last convolutional layer of the model; with more than one layer, the GradCam activations are computed as the average of the activations of those layers (see the sketch after this list).
  • tags: the name of the folder where the GradCam activations will be saved
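
As a rough illustration of what the cam_method options compute, here is a minimal Grad-CAM++ sketch based on the pytorch-grad-cam library (pip install grad-cam), applied to the last convolutional block of MobileNetV2. Whether gradcam.py relies on this library, which layers it targets for each backbone, and which class it explains are assumptions here; the script remains the reference implementation.

    # Minimal Grad-CAM++ sketch with the pytorch-grad-cam library.
    # Model, target layer and target class are illustrative assumptions.
    import torch
    import torch.nn as nn
    from torchvision import models
    from pytorch_grad_cam import GradCAMPlusPlus
    from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget
    from pytorch_grad_cam.utils.image import show_cam_on_image

    model = models.mobilenet_v2()
    model.classifier[1] = nn.Linear(model.last_channel, 2)    # binary real/fake head
    # In practice, load the trained checkpoint here before computing the CAMs.
    model.eval()

    # With num_layers = 1, activations come from the last convolutional block;
    # with more layers, the resulting CAMs are averaged together.
    target_layers = [model.features[-1]]

    input_tensor = torch.rand(1, 3, 224, 224)                 # stand-in for a face crop
    rgb_image = input_tensor[0].permute(1, 2, 0).numpy()      # HWC float image in [0, 1]

    cam = GradCAMPlusPlus(model=model, target_layers=target_layers)
    grayscale_cam = cam(input_tensor=input_tensor,
                        targets=[ClassifierOutputTarget(1)])[0]   # class 1 = 'fake' (assumed)
    overlay = show_cam_on_image(rgb_image, grayscale_cam, use_rgb=True)
    # 'overlay' is a uint8 heatmap that can be saved, e.g. with OpenCV or PIL.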

Citation

If your research uses our dataset, models, or code, partially or in full, please cite:

@INPROCEEDINGS{11113429,
  author={Ziglio, Riccardo and Pasquini, Cecilia and Ranise, Silvio},
  booktitle={2025 13th International Workshop on Biometrics and Forensics (IWBF)}, 
  title={Spotting Tell-Tale Visual Artifacts in Face Swapping Videos: Strengths and Pitfalls of CNN Detectors}, 
  year={2025},
  volume={},
  number={},
  pages={01-06},
  keywords={Biometrics;Visualization;Forensics;Soft sensors;Conferences;Detectors;Real-time systems;Data models;Faces;Videos;face swapping;face verification;remote video calls;forensic detection},
  doi={10.1109/IWBF63717.2025.11113429}
}
