This project reads vowel recordings stored as text files, extracts cepstral coefficients from them, and predicts the spoken vowel using Tokhura's distance. The pipeline covers the key signal processing steps: DC shift correction, normalization, steady-state selection, Hamming windowing, autocorrelation, LPC coefficient calculation, cepstral coefficient extraction, and raised sine window application.
- Input Files: The project processes text files representing audio signals of vowels.
- DC Shift & Normalization: Corrects for DC bias and normalizes the input audio.
- Steady State Selection: Extracts steady-state portions of the vowel sounds.
- Hamming Window: Reduces spectral leakage in the signal.
- Autocorrelation & LPC Coefficients: Computes LPC coefficients using autocorrelation.
- Cepstral Coefficients: Extracts cepstral coefficients for further analysis.
- Vowel Prediction: Utilizes Tokhura's distance to predict the vowel from test files.
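The preprocessing features listed above can be sketched in a few small functions. This is a minimal illustration, not the code in `main.cpp`: the function names, the default peak value of 5000, and the use of the highest-energy frame as a stand-in for the steady-state segment are all assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// DC shift correction: subtract the mean so the signal is centred on zero.
void removeDcShift(std::vector<double>& x) {
    if (x.empty()) return;
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= static_cast<double>(x.size());
    for (double& v : x) v -= mean;
}

// Normalization: scale so the largest absolute sample equals `peak`.
// The default peak is an assumed value, not one taken from the project.
void normalize(std::vector<double>& x, double peak = 5000.0) {
    double maxAbs = 0.0;
    for (double v : x) maxAbs = std::max(maxAbs, std::fabs(v));
    if (maxAbs == 0.0) return;  // silent signal: nothing to scale
    const double scale = peak / maxAbs;
    for (double& v : x) v *= scale;
}

// Steady-state selection (assumed heuristic): return the start index of the
// non-overlapping frame with the highest energy.
std::size_t highestEnergyFrameStart(const std::vector<double>& x,
                                    std::size_t frameSize) {
    if (frameSize == 0 || x.size() < frameSize) return 0;
    std::size_t best = 0;
    double bestEnergy = -1.0;
    for (std::size_t start = 0; start + frameSize <= x.size(); start += frameSize) {
        double energy = 0.0;
        for (std::size_t i = 0; i < frameSize; ++i)
            energy += x[start + i] * x[start + i];
        if (energy > bestEnergy) {
            bestEnergy = energy;
            best = start;
        }
    }
    return best;
}

// Hamming window: w[i] = 0.54 - 0.46 * cos(2*pi*i / (N - 1)).
void applyHamming(std::vector<double>& frame) {
    const std::size_t n = frame.size();
    if (n < 2) return;
    const double pi = std::acos(-1.0);
    for (std::size_t i = 0; i < n; ++i)
        frame[i] *= 0.54 - 0.46 * std::cos(2.0 * pi * i / (n - 1));
}
```

A typical order of calls would be `removeDcShift`, `normalize`, then frame selection, with `applyHamming` applied to each selected frame before autocorrelation.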
- `main.cpp`: The core C++ file that implements the vowel recognition system.
- `input/`: Folder containing input text files of vowel sounds.
- `output/`: Folder where the processed results (e.g., cepstral coefficients) are saved.
- `test/`: Folder with test files used for vowel prediction.
- C++ Compiler: Ensure you have a C++ compiler installed (e.g., GCC, Visual Studio).
- Clone or Download the project files to your local machine.
- Compile the `main.cpp` file using any C++ compiler: `g++ main.cpp -o vowel_recognition`
- Run the compiled program: `./vowel_recognition`
  - The program will process files in the `input/` folder, generate outputs in the `output/` folder, and predict vowels from files in the `test/` folder.
- View Results: The predicted vowel and other results will be displayed in the terminal, and detailed outputs will be saved in the `output/` folder.
- DC Shift and Normalization: Removes the DC offset and rescales the amplitude so that recordings are consistent with one another.
- Steady State Selection: Extracts the most stable segment of the vowel sound.
- Hamming Window: Applied to minimize spectral leakage.
- Autocorrelation & LPC Coefficients: Calculates Linear Predictive Coding (LPC) coefficients from the autocorrelation values of each frame.
- Cepstral Coefficients: Extracted from LPC coefficients for effective vowel classification.
- Tokhura's Distance: Measures how far the cepstral coefficients of a test file are from those of each reference vowel; the closest reference vowel is the prediction.
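The feature extraction steps above (autocorrelation, LPC, cepstral coefficients, raised sine window) can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the source does not say how the LPC equations are solved, so the Levinson-Durbin recursion is assumed here; the cepstral order is assumed equal to the LPC order; the gain term `c[0]` is left at zero; and all function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Autocorrelation: r[k] = sum over n of x[n] * x[n + k], for k = 0..p.
std::vector<double> autocorrelation(const std::vector<double>& x, std::size_t p) {
    std::vector<double> r(p + 1, 0.0);
    for (std::size_t k = 0; k <= p; ++k)
        for (std::size_t n = 0; n + k < x.size(); ++n)
            r[k] += x[n] * x[n + k];
    return r;
}

// Levinson-Durbin recursion (assumed solver). Returns a[1..p] for the
// predictor x[n] ~ sum_k a[k] * x[n - k]; a[0] is unused and left at 0.
std::vector<double> lpcFromAutocorrelation(const std::vector<double>& r) {
    if (r.empty()) return {};
    const std::size_t p = r.size() - 1;
    std::vector<double> a(p + 1, 0.0);
    double err = r[0];
    for (std::size_t i = 1; i <= p; ++i) {
        if (err == 0.0) break;  // zero-energy frame: stop early
        double acc = r[i];
        for (std::size_t j = 1; j < i; ++j) acc -= a[j] * r[i - j];
        const double k = acc / err;  // reflection coefficient
        const std::vector<double> prev = a;
        a[i] = k;
        for (std::size_t j = 1; j < i; ++j) a[j] = prev[j] - k * prev[i - j];
        err *= (1.0 - k * k);
    }
    return a;
}

// LPC to cepstrum: c[m] = a[m] + sum_{k=1}^{m-1} (k/m) * c[k] * a[m-k].
// c[0] (the gain term) is not computed here.
std::vector<double> cepstrumFromLpc(const std::vector<double>& a) {
    std::vector<double> c(a.size(), 0.0);
    for (std::size_t m = 1; m < a.size(); ++m) {
        c[m] = a[m];
        for (std::size_t k = 1; k < m; ++k)
            c[m] += (static_cast<double>(k) / m) * c[k] * a[m - k];
    }
    return c;
}

// Raised sine window: w[m] = 1 + (Q/2) * sin(pi * m / Q), m = 1..Q.
void applyRaisedSine(std::vector<double>& c) {
    if (c.size() < 2) return;
    const double q = static_cast<double>(c.size() - 1);
    const double pi = std::acos(-1.0);
    for (std::size_t m = 1; m < c.size(); ++m)
        c[m] *= 1.0 + (q / 2.0) * std::sin(pi * m / q);
}
```

For a windowed frame, the calls would chain as `autocorrelation`, `lpcFromAutocorrelation`, `cepstrumFromLpc`, and finally `applyRaisedSine` on the resulting coefficients.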
- The results will be displayed in the terminal, showing the predicted vowel for each test file.
- Processed data (such as cepstral coefficients) will be saved in the `output/` folder.
This project is licensed under the MIT License.