Two datasets are required:
- DeepFashion2
- USED
DeepFashion2 is available at this GitHub repository, which documents the dataset and gives the instructions to download the data.
You will need to fill in a form to get the password to unzip the files. To train Mask R-CNN, the dataset must be in COCO format. The DeepFashion2 authors provide a script that generates the COCO annotations; it can be found here.
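For orientation, the sketch below shows the general shape of a COCO-style instance annotation file, with a small consistency check. It is not the output of the DeepFashion2 conversion script: the file name, sizes, coordinates and the single category are illustrative values only, and `make_coco_skeleton` and `check_coco` are hypothetical helper names.

```python
import json


def make_coco_skeleton():
    """Build a tiny COCO-style instance-segmentation annotation dict.

    All values are made up for illustration; a real file is produced
    by the conversion script mentioned above.
    """
    return {
        "images": [
            {"id": 1, "file_name": "000001.jpg", "width": 640, "height": 480},
        ],
        "categories": [
            {"id": 1, "name": "short sleeve top"},
        ],
        "annotations": [
            {
                "id": 1,
                "image_id": 1,
                "category_id": 1,
                # COCO boxes are [x, y, width, height] in pixels.
                "bbox": [100.0, 120.0, 200.0, 150.0],
                "area": 200.0 * 150.0,
                # One polygon, flattened as x1, y1, x2, y2, ...
                "segmentation": [
                    [100.0, 120.0, 300.0, 120.0, 300.0, 270.0, 100.0, 270.0]
                ],
                "iscrowd": 0,
            },
        ],
    }


def check_coco(coco):
    """Return True when every annotation points at a known image and category."""
    image_ids = {img["id"] for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}
    return all(
        ann["image_id"] in image_ids and ann["category_id"] in category_ids
        for ann in coco["annotations"]
    )


coco = make_coco_skeleton()
coco_text = json.dumps(coco, indent=2)  # what would be written to disk
```

A check like `check_coco` can be run on the generated annotation file before starting a long training job, to catch dangling image or category ids early.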
The official USED web page can be found here. The page contains the link to download the data.
The downloaded data contains many txt annotation files, but some of them appear to be unused (possibly old or dirty files). Notebook U1 considers only the useful ones.
The following figure shows the order in which notebooks must be executed to train, check and evaluate the recommendation system:
It trains Mask R-CNN on the DeepFashion2 dataset. The dataset must be in COCO format.
Variant of Notebook 1 to perform the training on multiple GPUs.
It evaluates a trained Mask R-CNN model on the test set using the COCO evaluation metrics. Results are stored in a JSON file.
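The COCO metrics average precision over a range of intersection-over-union (IoU) thresholds, from 0.50 to 0.95 in steps of 0.05. As a minimal sketch of the underlying quantity, the function below computes IoU for two boxes in COCO `[x, y, width, height]` format. It is a simplified illustration, not the evaluation code used by the notebook (segmentation metrics use mask IoU rather than box IoU), and `box_iou` is a hypothetical helper name.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given in COCO [x, y, width, height] format."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh

    # Width and height of the overlap rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds over which COCO averages precision.
COCO_IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """List the COCO thresholds at which a prediction counts as a match."""
    iou = box_iou(pred_box, gt_box)
    return [t for t in COCO_IOU_THRESHOLDS if iou >= t]
```

A prediction that overlaps the ground truth only loosely passes the low thresholds and fails the high ones, which is why the averaged metric rewards precise localization.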
It creates a new dataset of isolated garment images from USED using Mask R-CNN. It loads a trained Mask R-CNN and runs inference on the whole USED dataset, performing clothes detection, classification and segmentation. For each detected garment it creates and saves a new image containing the segmented part on a black background. It also stores the image file paths, event class labels and garment class labels in csv files.
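The "segmented part on a black background" step can be sketched with NumPy as follows. This is an illustration under assumptions, not the notebook's code: it assumes the image is an `(H, W, 3)` uint8 array and that the model yields one boolean `(H, W)` mask per detected garment; `isolate_garment` is a hypothetical helper name.

```python
import numpy as np


def isolate_garment(image, mask):
    """Return a copy of `image` with every pixel outside `mask` set to black.

    image: uint8 array of shape (H, W, 3)
    mask:  boolean array of shape (H, W), True where the garment is
    """
    if image.shape[:2] != mask.shape:
        raise ValueError("image and mask must share height and width")
    # Add a trailing axis so the (H, W) mask broadcasts over the channels.
    return np.where(mask[..., None], image, 0).astype(image.dtype)


# Toy example: a uniform grey image with a 2x2 "garment" in the middle.
image = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
garment = isolate_garment(image, mask)
```

Applying the function once per detected mask produces one output image per garment, which matches the one-image-per-garment layout described above.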
It calculates and plots statistics on the dataset generated by Notebook U1.
Optional notebook that rebalances the train and test sets of the dataset generated by Notebook U1. It moves images from the csv test file to the csv train file.
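Moving entries between the two csv files could look like the sketch below. It is a minimal illustration using only the standard library, and it makes several assumptions that the notebook may not share: both files have the same header row, the train file ends with a newline, and the rows to move are picked at random. `move_rows` is a hypothetical helper name.

```python
import csv
import random


def move_rows(test_csv, train_csv, n_rows, seed=0):
    """Move `n_rows` randomly chosen rows from test_csv to train_csv.

    Both files are assumed to share the same header row.
    Returns the number of rows actually moved.
    """
    with open(test_csv, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    # Pick the row indices to move; a fixed seed keeps the split reproducible.
    rng = random.Random(seed)
    n_rows = min(n_rows, len(rows))
    picked = set(rng.sample(range(len(rows)), n_rows))
    moved = [row for i, row in enumerate(rows) if i in picked]
    kept = [row for i, row in enumerate(rows) if i not in picked]

    # Append the moved rows to the training file (header already present).
    with open(train_csv, "a", newline="") as f:
        csv.writer(f).writerows(moved)

    # Rewrite the test file with the header and the remaining rows.
    with open(test_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(kept)

    return len(moved)
```

Only the csv entries change; the image files themselves stay where they are, so the operation is cheap and easy to undo from a backup of the two csv files.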
It trains a pretrained ResNet50 model on the dataset generated by Notebook U1 to classify social events. The ResNet50 is pretrained on ImageNet and the convolutional parts are kept frozen during training.
It loads the model trained in Notebook 3B, unfreezes the last convolutional part of the ResNet50 and resumes the training process.
Similar to Notebooks 3B and 3C, but these use a ResNet50 classifier with an additional embedding layer (please refer to the presentation and the report for further information).
It evaluates the classifiers trained in Notebooks 3X on the test dataset generated by Notebook U1.