semanticClimate/image_classification

Exploring Vision Transformers in Practice

Citation

Hamadani, A., Kumari, R., Simon, W., Yadav, G., & Murray-Rust, P. (2025). Exploring Vision Transformers in Practice (0.1). Zenodo. https://doi.org/10.5281/zenodo.16734915

Description:

In this notebook, we fine-tune a pretrained Vision Transformer (ViT) on the Fashion Products Small dataset from Hugging Face, which contains 42,700 e-commerce images of apparel and accessories (e.g., shirts, watches) along with their metadata. The data is split into training, validation, and test sets. Rather than training from scratch, the notebook starts from the ViT-Base-Patch16-224-in21k checkpoint, loaded through Hugging Face’s Transformers library.
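
The numbers in this description can be made concrete with a small sketch. The checkpoint name encodes a 224×224 input cut into 16×16 patches, so the encoder sees one token per patch plus one classification token. The helper names below (`vit_token_count`, `split_sizes`) are hypothetical, and the 80/10/10 split is only an assumption for illustration; the notebook's actual split ratios are not stated here.

```python
from typing import Tuple


def vit_token_count(image_size: int = 224, patch_size: int = 16) -> int:
    """Tokens seen by the encoder: one per image patch plus one [CLS] token."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + 1


def split_sizes(n_items: int, val_pct: int = 10, test_pct: int = 10) -> Tuple[int, int, int]:
    """Return (train, validation, test) sizes.

    The percentages are illustrative assumptions, not the notebook's values.
    Integer arithmetic keeps the three sizes summing exactly to n_items.
    """
    n_val = n_items * val_pct // 100
    n_test = n_items * test_pct // 100
    return n_items - n_val - n_test, n_val, n_test


if __name__ == "__main__":
    # 224 / 16 = 14 patches per side, so 14 * 14 patches plus [CLS].
    print("tokens per image:", vit_token_count())
    # 42,700 is the dataset size quoted in the description above.
    print("train/val/test sizes:", split_sizes(42_700))
```

Because each image becomes a fixed, fairly short token sequence, fine-tuning a pretrained checkpoint on a dataset of this size is far cheaper than training the same architecture from scratch.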

Vision Transformers in Academic Research

  • Understanding Visual Texts
  • Analyzing Literature Reviews
  • Multimodal Literature Review
  • Sorting Large Collections
  • Symbolism Analysis in Art
  • Visual Semantic Mapping

Link to Notebook

Reviewers & review process: <Add reviewers and review process link>


Software citation information: CITATION.cff

License: Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ | License information: LICENSE
