This is a PyTorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector, using MobileNetV3 as the feature extractor.
Most parts of this implementation are adapted from https://github.com/argman/EAST
This project uses PyTorch, an open source machine learning framework.
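As a rough illustration of the backbone choice, the sketch below (not this repo's actual code) shows how torchvision's MobileNetV3-Large could be tapped for the multi-scale feature maps that an EAST-style feature-merging branch consumes.

```python
# Minimal sketch: using torchvision's MobileNetV3-Large as a multi-scale
# feature extractor. Illustrative only; the repo's backbone code may
# wire things differently.
import torch
from torchvision import models

backbone = models.mobilenet_v3_large().features  # randomly initialised here

x = torch.randn(1, 3, 512, 512)  # EAST is commonly trained on 512x512 crops
stages = []
for layer in backbone:
    x = layer(x)
    # keep the deepest feature map produced at each spatial resolution
    if stages and stages[-1].shape[-1] == x.shape[-1]:
        stages[-1] = x
    else:
        stages.append(x)

# The last four stages correspond to strides 4, 8, 16 and 32, which is
# what an EAST-style U-shaped feature-merging branch expects.
for fm in stages[-4:]:
    print(fm.shape)
```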
Download the project
git clone https://github.com/ishin-pie/east-mobilenet.git
Installing dependencies from the requirements.txt file
cd east-mobilenet
pip install -r requirements.txt
Note: we suggest installing inside a Python virtual environment.
Learn more: Installing Deep Learning Frameworks on Ubuntu with CUDA support
Running the demo of our pre-trained model
python demo.py -m=model_best.pth.tar -i=[path to your image]
Check the results in the "demo" folder.
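If you want to inspect or reuse the pre-trained weights outside the demo script, the sketch below opens the checkpoint with plain PyTorch; the 'state_dict' key is an assumption and may be named differently in this repo.

```python
# Minimal sketch: inspecting the pre-trained .pth.tar checkpoint with
# plain PyTorch. The 'state_dict' key is an assumption; the repo may
# store the weights under a different key or save them directly.
import torch

checkpoint = torch.load("model_best.pth.tar", map_location="cpu")
state_dict = checkpoint.get("state_dict", checkpoint) if isinstance(checkpoint, dict) else checkpoint
for name in list(state_dict)[:5]:
    print(name)
# model.load_state_dict(state_dict)  # with the model class defined in this repo
```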
Or you can run the camera demo (works well on a GPU machine):
python camera_demo.py -m=model_best.pth.tar
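Conceptually, the camera demo is a frame-by-frame loop like the illustrative sketch below; detect_quads() is a hypothetical placeholder for the repo's detector, shown only to outline the capture, detect, draw flow.

```python
# Illustrative sketch of a webcam loop similar to what camera_demo.py does.
# detect_quads() is a hypothetical placeholder for the repo's EAST detector.
import cv2
import numpy as np

def detect_quads(frame):
    # Placeholder: the real demo runs the EAST model here and returns
    # a list of 4x2 arrays of quadrilateral corner points.
    return []

cap = cv2.VideoCapture(0)  # default camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    for quad in detect_quads(frame):
        pts = np.asarray(quad, dtype=np.int32).reshape(-1, 1, 2)
        cv2.polylines(frame, [pts], isClosed=True, color=(0, 255, 0), thickness=2)
    cv2.imshow("EAST camera demo", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
```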
During training, we use the ICDAR 2015 Training Set and the ICDAR 2017 Training Set (Latin only). In addition, we use the ICDAR 2015 Test Set to validate our model.
The dataset should be structured as follows:
[dataset root directory]
├── train_images
│   ├── img_1.jpg
│   ├── img_2.jpg
│   └── ...
├── train_gts
│   ├── gt_img_1.txt
│   ├── gt_img_2.txt
│   └── ...
└── test_images
    ├── img_1.jpg
    ├── img_2.jpg
    └── ...
Note: the [dataset root directory] should be specified in the "config.json" file.
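As a quick sanity check of the layout above, the sketch below pairs each training image with its ground-truth file; it assumes config.json exposes the dataset root under a key such as "data_dir", which is an assumption, so check the actual key name in this repo's config.json.

```python
# Minimal sketch: pair every training image with its ground-truth file.
# The "data_dir" key is an assumption; check the actual key name used
# in this repo's config.json.
import json
from pathlib import Path

with open("config.json") as f:
    config = json.load(f)

root = Path(config["data_dir"])  # assumed key for [dataset root directory]
for img_path in sorted((root / "train_images").glob("img_*.jpg")):
    gt_path = root / "train_gts" / f"gt_{img_path.stem}.txt"
    assert gt_path.exists(), f"missing ground truth for {img_path.name}"
```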
Sample of ground truth format:
x1,y1,x2,y2,x3,y3,x4,y4,script,transcription
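A minimal parser for files in this format might look like the sketch below; it assumes comma-separated fields where the transcription may itself contain commas, and flags "###" transcriptions, which conventionally mark unreadable regions to ignore during training.

```python
# Minimal sketch: parsing one ground-truth file in the format
# x1,y1,...,y4,script,transcription. "###" conventionally marks text
# regions to ignore during training.
import numpy as np

def load_annotations(gt_path):
    polys, texts, ignored = [], [], []
    with open(gt_path, encoding="utf-8-sig") as f:  # ICDAR files often carry a BOM
        for line in f:
            parts = line.strip().split(",")
            if len(parts) < 10:
                continue
            polys.append(np.array(list(map(float, parts[:8])), dtype=np.float32).reshape(4, 2))
            text = ",".join(parts[9:])  # transcription may itself contain commas
            texts.append(text)
            ignored.append(text == "###")
            # parts[8] is the script label (e.g. "Latin")
    return polys, texts, ignored
```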
Training the model yourself
python train.py
Note: check the "config.json" file, which is used to adjust the training configuration.
Experiment on a GeForce RTX 2070