Commit 2c881c7

committed
added force alignment docs
1 parent 18e79a0 commit 2c881c7

28 files changed, +6853 -318 lines changed

README-pypi.rst

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ Features

 - **Age Detection**, detect age in speech using Finetuned Speaker Vector.
 - **Speaker Diarization**, diarizing speakers using Pretrained Speaker Vector.
+- **Force Alignment**, generate a time-aligned transcription of an audio file using RNNT.
 - **Emotion Detection**, detect emotions in speech using Finetuned Speaker Vector.
 - **Gender Detection**, detect genders in speech using Finetuned Speaker Vector.
 - **Language Detection**, detect hyperlocal languages in speech using Finetuned Speaker Vector.

README.rst

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ Features
 - **Age Detection**, detect age in speech using Finetuned Speaker Vector.
 - **Speaker Diarization**, diarizing speakers using Pretrained Speaker Vector.
 - **Emotion Detection**, detect emotions in speech using Finetuned Speaker Vector.
+- **Force Alignment**, generate a time-aligned transcription of an audio file using RNNT.
 - **Gender Detection**, detect genders in speech using Finetuned Speaker Vector.
 - **Language Detection**, detect hyperlocal languages in speech using Finetuned Speaker Vector.
 - **Multispeaker Separation**, Multispeaker separation using FastSep on 8k Wav.

docs/README.rst

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ Features
 - **Age Detection**, detect age in speech using Finetuned Speaker Vector.
 - **Speaker Diarization**, diarizing speakers using Pretrained Speaker Vector.
 - **Emotion Detection**, detect emotions in speech using Finetuned Speaker Vector.
+- **Force Alignment**, generate a time-aligned transcription of an audio file using RNNT.
 - **Gender Detection**, detect genders in speech using Finetuned Speaker Vector.
 - **Language Detection**, detect hyperlocal languages in speech using Finetuned Speaker Vector.
 - **Multispeaker Separation**, Multispeaker separation using FastSep on 8k Wav.

docs/force-alignment.ipynb

Lines changed: 734 additions & 0 deletions
Large diffs are not rendered by default.

docs/index.rst

Lines changed: 2 additions & 1 deletion
@@ -49,7 +49,8 @@ Contents:
 realtime-asr
 realtime-asr-mixed
 realtime-asr-rubberband
-realtime-force-alignment
+realtime-alignment
+force-alignment

 .. toctree::
 :maxdepth: 2

docs/load-stt-transducer-model-mixed.ipynb

Lines changed: 1 addition & 1 deletion
@@ -667,7 +667,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Predict force alignment\n",
+"### Predict alignment\n",
 "\n",
 "We want to know when the speakers speak certain words, so we can use `predict_timestamp`,\n",
 "\n",

docs/load-stt-transducer-model.ipynb

Lines changed: 1 addition & 1 deletion
@@ -835,7 +835,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Predict force alignment\n",
+"### Predict alignment\n",
 "\n",
 "We want to know when the speakers speak certain words, so we can use `predict_timestamp`,\n",
 "\n",

docs/realtime-force-alignment.ipynb renamed to docs/realtime-alignment.ipynb

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Realtime Force Alignment\n",
+"# Realtime Alignment\n",
 "\n",
 "Let say you want to align realtime recording / input, malaya-speech able to do that."
 ]
@@ -15,7 +15,7 @@
 "source": [
 "<div class=\"alert alert-info\">\n",
 "\n",
-"This tutorial is available as an IPython notebook at [malaya-speech/example/realtime-force-alignment](https://github.com/huseinzol05/malaya-speech/tree/master/example/realtime-force-alignment).\n",
+"This tutorial is available as an IPython notebook at [malaya-speech/example/realtime-alignment](https://github.com/huseinzol05/malaya-speech/tree/master/example/realtime-alignment).\n",
 " \n",
 "</div>"
 ]

example/force-alignment/force-alignment.ipynb

Lines changed: 734 additions & 0 deletions
Large diffs are not rendered by default.

example/realtime-force-alignment/realtime-force-alignment.ipynb renamed to example/realtime-alignment/realtime-alignment.ipynb

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Realtime Force Alignment\n",
+"# Realtime Alignment\n",
 "\n",
 "Let say you want to align realtime recording / input, malaya-speech able to do that."
 ]
@@ -15,7 +15,7 @@
 "source": [
 "<div class=\"alert alert-info\">\n",
 "\n",
-"This tutorial is available as an IPython notebook at [malaya-speech/example/realtime-force-alignment](https://github.com/huseinzol05/malaya-speech/tree/master/example/realtime-force-alignment).\n",
+"This tutorial is available as an IPython notebook at [malaya-speech/example/realtime-alignment](https://github.com/huseinzol05/malaya-speech/tree/master/example/realtime-alignment).\n",
 " \n",
 "</div>"
 ]
