Deep Neural Networks and Model-Based Approaches for Robust Speaker Diarization in Naturalistic Audio Streams
Speaker diarization is an unsupervised task that determines "who spoke and when" within an input audio stream. It consists of four sub-systems: (i) speech activity detection (SAD); (ii) speaker segmentation and modeling; ...
Domain Adaptation for Speech Based Emotion Recognition
One of the main barriers to the deployment of speech emotion recognition systems in real applications is the lack of generalization of the emotion classifiers. The recognition performance achieved in controlled recordings ...
Novel Frameworks for Attribute-Based Speech Emotion Recognition using Time-Continuous Traces and Sentence-Level Annotations
Speech emotion recognition (SER) plays an important role in a growing world of automation and artificial intelligence. Robust and accurate SER systems are crucial for enhancing human-computer interaction. Emotional ...