Whisper Audio Transcription

This project is a Python script that extracts the audio track from an MP4 file and transcribes it with a Whisper model.

Requirements

Python 3.x
ffmpeg
Whisper

Installation

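Install ffmpeg and add it to your PATH.
- It can be downloaded from the ffmpeg download page.
- Install it to C:/ffmpeg/bin and add that directory to the PATH environment variable.
Install the required Python packages.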

conda create -n whisper-env python=3.11
conda activate whisper-env
conda install numpy=1.26 -y
conda install libuv -y
python -m pip install torch==2.5.1+cxx11.abi torchvision==0.20.1+cxx11.abi torchaudio==2.5.1+cxx11.abi intel-extension-for-pytorch==2.5.10+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/mtl/us/
pip install openai-whisper
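
Once the commands above have run, a quick check can confirm that ffmpeg is reachable on the PATH and that the `whisper` package is importable. The following is a minimal sketch, not part of the repository; the function name `missing_requirements` is hypothetical.

```python
import importlib.util
import shutil


def missing_requirements(tools=("ffmpeg",), modules=("whisper",)) -> list[str]:
    """Return the names of command-line tools and Python modules that cannot be found."""
    # shutil.which returns None when the executable is not on the PATH.
    missing = [tool for tool in tools if shutil.which(tool) is None]
    # find_spec returns None when a top-level module is not installed.
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing
```

Calling `missing_requirements()` inside the activated `whisper-env` environment should return an empty list when both ffmpeg and openai-whisper are set up; any name it returns still needs to be installed or added to the PATH.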

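The goal is to run on an Intel Xe GPU, which requires IPEX (Intel Extension for PyTorch) to be set up: https://pytorch-extension.intel.com/installation?platform=gpu&version=v2.5.10%2Bxpu&os=windows&package=pip

So far, however, this does not work. IPEX itself is still experimental.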
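
Since the GPU path is not working yet, it can help to check whether PyTorch sees an XPU device at all and to fall back to the CPU otherwise. The sketch below is an assumption about how such a check could look; it is not the content of `checkIntelGPU.py`, and the names `xpu_available` and `pick_device` are hypothetical.

```python
def xpu_available() -> bool:
    """Return True only if PyTorch is installed and reports a usable XPU device."""
    try:
        import torch
    except ImportError:
        return False
    try:
        # Importing IPEX registers the Intel XPU backend on builds that need it.
        # Any failure here (missing package, DLL load error) is treated as "no XPU".
        import intel_extension_for_pytorch  # noqa: F401
    except Exception:
        pass
    xpu = getattr(torch, "xpu", None)
    return bool(xpu is not None and xpu.is_available())


def pick_device() -> str:
    """Choose "xpu" when available, otherwise fall back to "cpu"."""
    return "xpu" if xpu_available() else "cpu"
```

`pick_device()` returning `"cpu"` means the Intel GPU is not visible to PyTorch in the current environment. Whether Whisper then runs correctly on `"xpu"` is a separate question that this check does not answer.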

Usage

Run the script:

python convert_and_transcribe.py input.mp4

The script works in two steps:
- It extracts the audio from the MP4 file and writes an MP3 file.
- It transcribes the MP3 file with a Whisper model and writes a text file.
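
The two steps above can be sketched roughly as follows. This is an illustration under assumptions, not the actual contents of `convert_and_transcribe.py`: the function names, the output file naming (same base name with `.mp3` and `.txt`), and the `"base"` model size are all hypothetical choices.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(mp4_path: Path, mp3_path: Path) -> list[str]:
    """Build the ffmpeg call that extracts the audio track into an MP3 file."""
    # -y overwrites an existing output file, -vn drops the video stream;
    # ffmpeg selects the MP3 format from the .mp3 extension.
    return ["ffmpeg", "-y", "-i", str(mp4_path), "-vn", str(mp3_path)]


def convert_and_transcribe(mp4_path: Path, model_name: str = "base") -> Path:
    """Extract audio with ffmpeg, transcribe it with Whisper, return the text file path."""
    import whisper  # imported lazily so the helper above works without the package

    mp3_path = mp4_path.with_suffix(".mp3")
    txt_path = mp4_path.with_suffix(".txt")

    # Step 1: MP4 -> MP3 (raises CalledProcessError if ffmpeg fails).
    subprocess.run(build_ffmpeg_command(mp4_path, mp3_path), check=True)

    # Step 2: MP3 -> text. transcribe() returns a dict whose "text" key
    # holds the full transcript.
    model = whisper.load_model(model_name)
    result = model.transcribe(str(mp3_path))
    txt_path.write_text(result["text"], encoding="utf-8")
    return txt_path
```

With ffmpeg on the PATH and openai-whisper installed, `convert_and_transcribe(Path("input.mp4"))` would write `input.mp3` and `input.txt` next to the input file. Larger model names trade speed for accuracy.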
