E. Spyrou, I. Vernikos, R. Nikopoulou, Ph. Mylonas
A Non-Linguistic Approach for Human Emotion Recognition from Speech
IEEE International Conference on Information, Intelligence, Systems and Applications 2018 (IISA), July 2018, Zakynthos, Greece
ABSTRACT
One of the most important issues in several aspects of human-computer interaction is understanding the users' emotional state. In several applications, such as monitoring humans in assistive living environments or assessing students' affective state during a course, it is imperative to use an unobtrusive method, so as to avoid discomforting or distracting the user. Thus, one should opt for approaches that use either visual or audio sensors, which may observe users without any kind of direct contact. In this work, our goal is to recognize the emotional state of humans using only the non-linguistic aspect of speech information, i.e., the acoustic properties of speech. To this end, we propose an emotion classification approach based on the bag-of-visual-words model, which has previously been applied to many computer vision tasks. A given audio segment is transformed to a spectrogram, i.e., a visual representation of its spectrum. From this representation we first extract SURF features and, using a previously constructed visual vocabulary, quantize them into a set of visual words. A histogram of visual words is then constructed per image, and these histograms serve as the feature vectors used to train SVM classifiers. Our approach is evaluated using a) three publicly available datasets that contain speech in different languages and b) a custom dataset constructed during real-life classroom experiments involving middle-school students.
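The quantization and histogram steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the descriptors and the vocabulary are random stand-ins (real ones would come from SURF extraction on spectrogram images and from a vocabulary built beforehand), the vocabulary size of 8 is arbitrary, the L1 normalisation is an assumption, and the function names `quantize` and `bovw_histogram` are hypothetical.

```python
import numpy as np

def quantize(descriptors, vocabulary):
    """Map each descriptor to the index of its nearest visual word (Euclidean)."""
    # Pairwise squared distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def bovw_histogram(descriptors, vocabulary):
    """Histogram of visual-word counts for one image, L1-normalised (assumed)."""
    words = quantize(descriptors, vocabulary)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

# Toy stand-in data (assumption): 64-D vectors, the default SURF descriptor size.
rng = np.random.default_rng(0)
vocabulary = rng.normal(size=(8, 64))     # 8 hypothetical visual words
descriptors = rng.normal(size=(50, 64))   # 50 descriptors from one spectrogram
feature_vector = bovw_histogram(descriptors, vocabulary)
```

Each spectrogram then yields one fixed-length `feature_vector` regardless of how many descriptors were extracted from it, which is what makes the representation usable as input to a classifier such as an SVM.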
23 July 2018