An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
Updated Jul 25, 2024 - Python
A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or video retrieval.
Video-Text Representation Learning via Differentiable Weak Temporal Alignment (CVPR 2022)
Text from the video is extracted and saved into a .docx file in the form of notes.
MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian (Bahasa Indonesia).
Capstone project for UPSchool AI First Developer Program
This is the folder for my thesis paper. The PDF file of the paper is included.
A full-screen video-behind-text animation built with Next.js.