Multimodal Conditional Information Bottleneck for Generalizable AI-Generated Image Detection

Official implementation of Multimodal Conditional Information Bottleneck for Generalizable AI-Generated Image Detection.

Haotian Qin¹, Dongliang Chang¹*, Yueying Gao¹, Bingyao Yu², Lei Chen², Zhanyu Ma¹

¹Beijing University of Posts and Telecommunications, Beijing, China
²Tsinghua University, Beijing, China

*Corresponding author.

News

June 2025: Released training code and GenImage caption annotations. The initial codebase is now available for public use, including scripts for training the model and annotations for the GenImage dataset.

TODO

Validation Code: Implement and release validation scripts to evaluate the model's performance on various datasets.
Inference Code: Develop and share inference scripts for applying the trained model to new data for AI-generated image detection.
Plug-and-Play: Add a modular, plug-and-play implementation of the MMCIB framework to facilitate easy integration into other networks.

Installation

To get started with the project, follow these steps:

Clone the repository:

git clone https://github.com/Ant0ny44/InfoFD.git

Install dependencies:

conda env create -f environment.yml -n infoFD

Set your Weights & Biases (wandb) token in env.ini:
```
[WANDB]
TOKEN=YOUR_WANDB_TOKEN_HERE
```
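
To check that the token is set before launching a run, a minimal sketch along these lines may help. It assumes only the `[WANDB]` section and `TOKEN` key shown above; the function name `read_wandb_token` is hypothetical, and the way the repository itself consumes env.ini may differ.

```python
import configparser
from pathlib import Path


def read_wandb_token(ini_path="env.ini"):
    """Return the TOKEN value from the [WANDB] section of env.ini.

    Hypothetical helper for illustration; not part of the repository.
    """
    # ConfigParser.read() silently skips missing files, so check explicitly.
    if not Path(ini_path).is_file():
        raise FileNotFoundError(f"{ini_path} not found")
    parser = configparser.ConfigParser()
    parser.read(ini_path)
    token = parser.get("WANDB", "TOKEN").strip()
    # Reject an empty value or the placeholder shipped in the template.
    if not token or token == "YOUR_WANDB_TOKEN_HERE":
        raise ValueError("Replace the placeholder TOKEN in env.ini with your own token")
    return token
```

Calling `read_wandb_token()` from the repository root returns the token string, or raises if the file is missing or still contains the placeholder.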

Change the data path in configs/EP1.yml:

data:
  # Cache paths for the training/validation/test data.
  # You can also point these directly at preprocessed training/validation/test data.
  # With cached training, if a specified path does not exist, preprocessing runs first
  # and the result is stored at the corresponding cache path.
  # Note: training data requires captions for the images.
  train_root_cache: TRAIN_ROOT_CACHE_PATH
  val_root_cache: VAL_ROOT_CACHE_PATH
  test_root_cache: TEST_ROOT_CACHE_PATH

  train_root: GENIMAGE_TRAIN_PATH
  train_captions_path: ./data/genImage_train_captions.json
  val_root: GENIMAGE_VAL_PATH
  test_root: GENIMAGE_TEST_PATH

  prompts: True
  shuffle: True
  num_workers: 14
  batch_size: 512
  ...
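
The cache behavior described in the config comments (use the cache if it exists, otherwise preprocess and store the result) can be sketched as follows. This is an illustration of the pattern only: `load_or_build`, the `build_fn` callback, and the use of `pickle` as the storage format are all assumptions, not the repository's actual implementation.

```python
import pickle
from pathlib import Path


def load_or_build(cache_path, build_fn):
    """Load preprocessed data from cache_path, or build and cache it if absent.

    Hypothetical helper: build_fn stands in for whatever preprocessing
    the training code performs on the raw GenImage data.
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cache hit: skip preprocessing entirely.
        with cache_path.open("rb") as f:
            return pickle.load(f)
    # Cache miss: preprocess, then store the result for later runs.
    data = build_fn()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as f:
        pickle.dump(data, f)
    return data
```

With this pattern the first run pays the preprocessing cost and later runs read the cached file, which matches the behavior the comments describe for `train_root_cache`, `val_root_cache`, and `test_root_cache`.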

Usage

Training

The training code is in train.py. To train the model, run:

bash scripts/train_EP1.sh

To generate the statistical results, run:

bash scripts/train_EP1_stas.sh

See the configs/ directory for model details.

GenImage Training Caption Annotations

The GenImage training caption annotations are available in the file data/genImage_train_captions.json. These annotations provide textual descriptions for the GenImage training images, generated by InternVL.
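
The layout of the annotation file is not documented here, so a quick inspection before use can be helpful. The sketch below assumes only that the file is valid JSON whose top level is an object or an array; the function name `summarize_captions` is hypothetical.

```python
import json


def summarize_captions(path="./data/genImage_train_captions.json", preview=3):
    """Return (top-level type name, entry count, first few entries).

    Hypothetical helper; it makes no assumption about what the keys
    or values mean, only that the top level is a dict or a list.
    """
    with open(path, "r", encoding="utf-8") as f:
        captions = json.load(f)
    if isinstance(captions, dict):
        items = list(captions.items())
    else:
        # Treat anything else as a sequence and pair entries with indices.
        items = list(enumerate(captions))
    return type(captions).__name__, len(items), items[:preview]
```

Printing the returned tuple shows whether entries are keyed by image path or stored as a list, which determines how to join captions to images.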

Contact

Thank you for your interest. We're currently finalizing the code organization. If you have any questions, please don't hesitate to reach out at ant0ny@163.com.

Citation

If you use this code or the GenImage annotations in your research, please cite our work using the following BibTeX entry:

@article{qin2025multimodal,
  title={Multimodal Conditional Information Bottleneck for Generalizable AI-Generated Image Detection},
  author={Qin, Haotian and Chang, Dongliang and Gao, Yueying and Yu, Bingyao and Chen, Lei and Ma, Zhanyu},
  journal={arXiv preprint arXiv:2505.15217},
  year={2025}
}

License

This project is licensed under the MIT License. See the LICENSE file for details.
