Emotional tone recognition from speech and text: a supervised machine learning and affective computing approach

Abstract

This report presents a system for classifying emotional tone in speech and text sequences using machine learning models that range from several shallow atemporal models to recurrent deep learning. Identifying emotion from speech is a significant task. Emotion recognition experiments are carried out on speech features, on text features derived from the speech transcriptions, and on a combination of both. We build a Long Short-Term Memory (LSTM) classifier that recognizes emotion from an input speech signal. The model is evaluated on the IEMOCAP dataset under three settings: Audio-Only, Text-Only, and Audio + Text, in which audio features, text features, and both, respectively, are used to train the model. For comparison, we follow two approaches, both of which start from eight features extracted from the audio signal. In the first approach, the extracted features are used to train six machine learning classifiers; in the second, a feedforward neural network and an LSTM-based classifier are used.
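The three experimental settings can be illustrated with a minimal sketch. It assumes that the combined setting concatenates each utterance's audio and text feature vectors (early fusion); the abstract does not state the fusion method. The function name `build_settings`, the toy values, and the text feature size of 3 are hypothetical. Only the count of eight audio features comes from the abstract.

```python
from typing import Dict, List

Vector = List[float]


def build_settings(audio: List[Vector], text: List[Vector]) -> Dict[str, List[Vector]]:
    """Return per-utterance feature vectors for the three settings.

    Assumption: the combined setting concatenates audio and text features
    (early fusion). The abstract does not say how the two are combined.
    """
    if len(audio) != len(text):
        raise ValueError("audio and text must describe the same utterances")
    return {
        "audio_only": [list(a) for a in audio],
        "text_only": [list(t) for t in text],
        "audio_text": [list(a) + list(t) for a, t in zip(audio, text)],
    }


# Toy data: 2 utterances, 8 audio features each (the count given in the
# abstract) and 3 text features each (a hypothetical size).
audio = [[0.1] * 8, [0.2] * 8]
text = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]

settings = build_settings(audio, text)
for name, rows in settings.items():
    # Print the setting name, number of utterances, and feature dimension.
    print(name, len(rows), len(rows[0]))
```

Each entry of `settings` could then be passed to the same classifier so that the three settings are compared on equal footing.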

Keywords

Emotion recognition, LSTM, Machine learning, Deep learning

Graduation Month

December

Degree

Master of Science

Department

Department of Computer Science

Major Professor

William H. Hsu

Date

2021

Type

Report
