This repository provides training and evaluation code for the paper
*GatedxLSTM: A Multimodal Affective Computing Approach for Emotion Recognition in Conversations*. [[Paper Link](https://arxiv.org/abs/2503.20919)]
Key contributions of GatedxLSTM include:
- CLAP-based Cross-modal Alignment: Incorporates Contrastive Language-Audio Pretraining (CLAP) for improved speech-text alignment.
- Gated Modality Fusion: A gating mechanism that emphasises emotionally salient utterances (an illustrative sketch follows this list).
- Dialogical Emotion Decoder (DED): Captures context-aware emotional transitions across conversation turns.
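To give a rough picture of the gating idea, the minimal PyTorch sketch below applies a sigmoid gate to per-utterance speech and text embeddings. The module name `GatedFusion`, the embedding dimension, and the layer choices are illustrative assumptions for this README, not the exact GatedxLSTM implementation from the paper.

```python
import torch
import torch.nn as nn


class GatedFusion(nn.Module):
    """Illustrative sigmoid-gated fusion of speech and text utterance embeddings.

    A generic sketch only; not the exact GatedxLSTM module from the paper.
    """

    def __init__(self, dim: int = 512):
        super().__init__()
        # One scalar gate per utterance, computed from both modalities.
        self.gate = nn.Sequential(
            nn.Linear(2 * dim, dim),
            nn.ReLU(),
            nn.Linear(dim, 1),
            nn.Sigmoid(),
        )

    def forward(self, speech: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        # speech, text: (batch, dim) CLAP-aligned utterance embeddings.
        g = self.gate(torch.cat([speech, text], dim=-1))  # (batch, 1)
        # The gate re-weights the two modality streams for each utterance.
        return g * speech + (1.0 - g) * text


if __name__ == "__main__":
    fusion = GatedFusion(dim=512)
    s, t = torch.randn(4, 512), torch.randn(4, 512)
    print(fusion(s, t).shape)  # torch.Size([4, 512])
```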
Download the IEMOCAP dataset and extract it to a local directory.
Install the required dependency packages.
Update the dataset path in ./data/preprocess.py, then run:
```bash
python ./data/preprocess.py
```
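For reference, the path to edit in ./data/preprocess.py will look something like the line below; the variable name and value here are hypothetical placeholders, so change whatever path variable the script actually defines to point at your extracted IEMOCAP folder.

```python
# Hypothetical example of the line to edit in ./data/preprocess.py;
# point it at the directory where IEMOCAP was extracted.
IEMOCAP_PATH = "/path/to/IEMOCAP_full_release"
```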
To train and evaluate the model, run:
```bash
python ./Dialogical-Emotion-Decoding/main.py
```
This work has been accepted at ACII 2025.
If you find this work useful, please cite our paper:
```bibtex
@article{li2025gatedxlstm,
  title={GatedxLSTM: A Multimodal Affective Computing Approach for Emotion Recognition in Conversations},
  author={Li, Yupei and Sun, Qiyang and Murthy, Sunil Munthumoduku Krishna and Alturki, Emran and Schuller, Bj{\"o}rn W},
  journal={arXiv preprint arXiv:2503.20919},
  year={2025}
}
```