A deep learning project for generating Bangla (Bengali) captions from images, built with PyTorch. This repository provides tools for training, evaluating, and using image captioning models on Bangla datasets.
- End-to-end image captioning in Bangla
- Custom dataset support
- PyTorch-based modular code
- Easy training and evaluation scripts
- Clone the repository:
git clone https://github.com/yourusername/Bangla-Image-Captioning.git
cd Bangla-Image-Captioning
- Install dependencies:
- Python 3.6+
- PyTorch 1.0 or later (pytorch.org)
- NLTK
Install with pip:
pip install torch nltk
Then, download NLTK data:
import nltk
nltk.download()
- The default dataset is from BanglaLekha, but you can use your own dataset.
- Format:
Each line in your CSV should be:
/path/to/image1, "caption in Bangla"
Example (train.csv):
0001.jpg, "একজন মেয়ে হাত বাড়িয়ে বৃষ্টির পানি ধরার চেষ্টা করছেন "
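To check a CSV against the format above before training, a small reader along the following lines may help. This is a minimal sketch using only the standard library; the function name `read_caption_pairs` is hypothetical and is not part of this repository, whose own loading code may differ.

```python
import csv
from typing import Iterable, List, Tuple


def read_caption_pairs(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse `image_path, "caption"` rows into (image_path, caption) tuples.

    Hypothetical helper for illustration; not the repository's loader.
    """
    pairs = []
    # skipinitialspace=True lets the quoted caption follow ", " rather than ","
    reader = csv.reader(lines, skipinitialspace=True)
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed rows
        image_path, caption = row[0].strip(), row[1].strip()
        if image_path and caption:
            pairs.append((image_path, caption))
    return pairs


if __name__ == "__main__":
    sample = ['0001.jpg, "একজন মেয়ে হাত বাড়িয়ে বৃষ্টির পানি ধরার চেষ্টা করছেন "']
    for path, caption in read_caption_pairs(sample):
        print(path, "->", caption)
```

To read a real file, pass an open file object instead of a list, for example `open("train.csv", encoding="utf-8", newline="")`. Opening with UTF-8 explicitly avoids platform-dependent decoding of the Bangla text.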
- Training:
Edit config.py as needed, then run:
python train.py
- Vocabulary Building:
python build_vocabulary.py
- Custom Dataset:
Place your images and CSV in the appropriate folders and update paths in config.py.
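For readers unfamiliar with the vocabulary step, the sketch below shows what building a caption vocabulary typically involves: counting tokens, dropping rare ones, and reserving special tokens. It is an illustration only. What `build_vocabulary.py` actually does is not documented here, the names `build_vocab`, `encode`, and the special tokens are assumptions, and plain whitespace splitting stands in for NLTK tokenization so the example stays self-contained.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Hypothetical special tokens; the repository may use different names.
SPECIAL_TOKENS = ["<pad>", "<start>", "<end>", "<unk>"]


def build_vocab(captions: Iterable[str], min_count: int = 1) -> Dict[str, int]:
    """Map each sufficiently frequent word to an integer id."""
    counts = Counter()
    for caption in captions:
        # Whitespace split is an assumption; a real pipeline may use NLTK.
        counts.update(caption.split())
    vocab = {token: idx for idx, token in enumerate(SPECIAL_TOKENS)}
    # Sort by descending frequency, then alphabetically, for stable ids.
    for word, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_count:
            vocab[word] = len(vocab)
    return vocab


def encode(caption: str, vocab: Dict[str, int]) -> List[int]:
    """Convert a caption to ids, wrapping it in <start> and <end>."""
    unk = vocab["<unk>"]
    ids = [vocab["<start>"]]
    ids += [vocab.get(word, unk) for word in caption.split()]
    ids.append(vocab["<end>"])
    return ids


if __name__ == "__main__":
    vocab = build_vocab(["ক খ", "ক গ"], min_count=2)
    print(vocab)
    print(encode("ক খ", vocab))
```

Raising `min_count` shrinks the vocabulary and maps rare words to `<unk>`, which is a common trade-off when the caption set is small.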
Contributions, issues, and feature requests are welcome! Feel free to open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
- BanglaLekha for the dataset
- PyTorch and NLTK communities