The International Arab Journal of Information Technology (IAJIT)


Evaluation of Influence of Arousal-Valence Primitives on Speech Emotion Recognition

Speech emotion recognition is a challenging research problem of significant scientific interest, and it has attracted considerable research and development in recent years. In this article, we present a study that aims to improve the accuracy of speech emotion recognition using a hierarchical method based on Gaussian Mixture Models and Support Vector Machines for dimensional and continuous prediction of emotions in the valence (positive vs. negative emotion) and arousal (degree of emotional intensity) space. According to these dimensions, emotions are first categorized into N broad groups. These N groups are then further classified into finer groups using a spectral representation. We verify and compare the performance of the different proposed multi-level models in order to study the differential effects of emotional valence and arousal on the recognition of a basic emotion. Experimental studies are performed on the Berlin Emotional database and the Surrey Audio-Visual Expressed Emotion corpus, which express different emotions in German and English, respectively.
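
The two-level scheme described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: a simple nearest-centroid rule stands in for the GMM and SVM models, the two-dimensional feature vectors are made-up toy values rather than spectral features, and the mapping of emotions to arousal groups (`AROUSAL_GROUPS`) is a hypothetical choice made only for illustration.

```python
from statistics import mean

# Hypothetical grouping of basic emotions by arousal (an assumption, not from the paper).
AROUSAL_GROUPS = {
    "high": ["anger", "joy"],
    "low": ["sadness", "boredom"],
}


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [mean(column) for column in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Stand-in classifier; the paper uses GMM and SVM models instead."""

    def fit(self, X, y):
        self.centroids = {
            label: centroid([x for x, t in zip(X, y) if t == label])
            for label in sorted(set(y))
        }
        return self

    def predict_one(self, x):
        return min(self.centroids, key=lambda c: squared_distance(x, self.centroids[c]))


class HierarchicalEmotionClassifier:
    """Level 1 picks a broad group; level 2 picks an emotion within that group."""

    def __init__(self, groups):
        self.groups = groups
        self.emotion_to_group = {e: g for g, emotions in groups.items() for e in emotions}

    def fit(self, X, y):
        group_labels = [self.emotion_to_group[e] for e in y]
        self.level1 = NearestCentroid().fit(X, group_labels)
        self.level2 = {}
        for group in self.groups:
            idx = [i for i, g in enumerate(group_labels) if g == group]
            self.level2[group] = NearestCentroid().fit(
                [X[i] for i in idx], [y[i] for i in idx]
            )
        return self

    def predict_one(self, x):
        group = self.level1.predict_one(x)
        return group, self.level2[group].predict_one(x)


# Toy training data: two made-up features per utterance.
X = [[0.9, 0.2], [0.8, 0.3], [0.9, 0.8], [0.8, 0.9],
     [0.1, 0.2], [0.2, 0.1], [0.1, 0.8], [0.2, 0.9]]
y = ["anger", "anger", "joy", "joy",
     "sadness", "sadness", "boredom", "boredom"]

clf = HierarchicalEmotionClassifier(AROUSAL_GROUPS).fit(X, y)
print(clf.predict_one([0.85, 0.85]))
```

In this sketch the first level only has to separate broad groups, so each second-level model discriminates among fewer, more similar emotions; swapping the stand-in classifier for a GMM or SVM would not change the overall structure.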

Imen Trabelsi received her MS degree in signal processing in 2011 from the Institute of Computer Science of Tunis (ISI-Tunisia) and her PhD degree in electrical engineering, with a specialization in signal processing, in 2015 from the University of Tunis-El Manar (Tunisia). Her main areas of interest include speech processing, pattern recognition, machine learning, artificial intelligence and emotion recognition.

Dorra Ben Ayed received a computer science engineering degree in 1995 from the National School of Computer Science (ENSI-Tunisia), an MS degree in electrical engineering (signal processing) in 1997 from the National School of Engineers of Tunis (ENIT-Tunisia), and a PhD degree in electrical engineering (signal processing) in 2003 from ENIT-Tunisia. She is currently an associate professor in the computer science department at the High Institute of Computer Science of Tunis (ISI-Tunisia). Her research interests include fuzzy logic, support vector machines, artificial intelligence, pattern recognition, speech recognition and speaker identification.

Noureddine Ellouze received a PhD degree in 1977 from l'Institut National Polytechnique at Paul Sabatier University (Toulouse, France), and an Electronic Engineer Diploma from ENSEEIHT in 1968 at the same university. In 1978, Dr. Ellouze joined the Department of Electrical Engineering at the National School of Engineers of Tunis (ENIT-Tunisia) as Assistant Professor. In 1990, he became Professor in Signal Processing, Digital Signal Processing and Stochastic Processes. He is now Director of the Signal Processing Research Laboratory (LSTS) at ENIT. His research interests include neural networks and fuzzy classification, pattern recognition, and signal and image processing applied to biomedical, multimedia, and man-machine communication.