A New Front-End for Classification of Non-Speech Sounds: A Study on Human Whistle

Publisher

International Speech Communication Association

Abstract

Speech/non-speech sound classification is an important problem in audio diarization, audio document retrieval, and advanced human interfaces. This study develops spectral and temporal acoustic features for speech/non-speech sound classification, based on the production differences between speech and whistle. Seven time-domain and frequency-domain features are investigated. The proposed feature set is evaluated at the frame level on the task of speech/whistle classification, using support vector machines (SVM) and Gaussian mixture models (GMM) as back-end classifiers. At the frame level, the proposed front-end fusion yields absolute performance gains of +15.0% and +3.1% over MFCC with the SVM-based and GMM-based classifiers, respectively. This research will benefit the development of intelligent speech interfaces for identification, recognition, and speech coding, as a preprocessing step for real-world audio streams.
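
As a rough illustration of what frame-level time-domain and frequency-domain features look like, the following minimal sketch computes one feature of each kind per frame. It is not the paper's front-end: the abstract does not name its seven features, so zero-crossing rate and spectral flatness are stand-ins chosen here, and the frame size, hop, sample rate, and test signals (a pure tone as a crude whistle-like sound, white noise as a generic broadband sound rather than real speech) are all assumptions. The resulting per-frame feature pairs are the kind of input a back-end classifier such as an SVM or GMM would consume.

```python
import cmath
import math
import random


def frames(signal, size=64, hop=32):
    """Split a signal into overlapping fixed-length frames (sizes are assumed)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def zero_crossing_rate(frame):
    """Time-domain stand-in feature: fraction of adjacent samples that change sign."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def spectral_flatness(frame):
    """Frequency-domain stand-in feature: geometric / arithmetic mean of the power spectrum."""
    n = len(frame)
    # Hann window to limit spectral leakage.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)))
                for k, x in enumerate(frame)]
    power = []
    for b in range(1, n // 2):  # naive DFT over positive-frequency bins, skipping DC
        coeff = sum(x * cmath.exp(-2j * math.pi * b * k / n)
                    for k, x in enumerate(windowed))
        power.append(abs(coeff) ** 2 + 1e-12)  # small floor avoids log(0)
    geometric_mean = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic_mean = sum(power) / len(power)
    return geometric_mean / arithmetic_mean


def frame_features(signal):
    """Return one (zero-crossing rate, spectral flatness) pair per frame."""
    return [(zero_crossing_rate(f), spectral_flatness(f)) for f in frames(signal)]


# Synthetic signals (assumptions, not data from the study).
rng = random.Random(0)
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1500 * t / sample_rate) for t in range(1024)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]

tone_feats = frame_features(tone)
noise_feats = frame_features(noise)
mean_flatness_tone = sum(f[1] for f in tone_feats) / len(tone_feats)
mean_flatness_noise = sum(f[1] for f in noise_feats) / len(noise_feats)
print(f"mean spectral flatness: tone={mean_flatness_tone:.4f}, noise={mean_flatness_noise:.4f}")
```

A narrowband, tone-like signal concentrates its energy in a few frequency bins, so its spectral flatness comes out much lower than that of broadband noise; a contrast of this kind is what lets frame-level features separate whistle-like sounds from other audio.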

Keywords

Intonation (Phonetics), Speech sounds, Sound Identification, Frequency (Acoustics), Support vector machines

Rights

©2015 ISCA

Citation