A New Front-End for Classification of Non-Speech Sounds: A Study on Human Whistle

Nandwana, Mahesh Kumar; Bořil, Hynek; Hansen, John H. L.

Files

JECS-3626-4679.56.pdf (357.36 KB)

Publisher

International Speech and Communication Association

URI

http://hdl.handle.net/10735.1/5061

Abstract

Speech/non-speech sound classification is an important problem in audio diarization, audio document retrieval, and advanced human interfaces. This study develops spectral and temporal acoustic features for speech/non-speech sound classification that exploit production differences between speech and whistle. Seven time- and frequency-domain features are investigated. The proposed feature set is evaluated at the frame level on the task of speech/whistle classification, using support vector machine (SVM) models and Gaussian mixture models (GMM) as back-end classifiers. At the frame level, the proposed front-end fusion yields absolute performance gains of +15.0% and +3.1% over MFCC with SVM- and GMM-based classifiers, respectively. This research will benefit the development of intelligent speech interfaces for identification, recognition, and speech coding, as a preprocessing step for real-world audio streams.
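
To make the idea of frame-level time- and frequency-domain features concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's front-end: the abstract does not name the seven features, so zero-crossing rate and spectral flatness are stand-ins chosen for this example. The synthetic signals (a pure tone as a whistle-like sound, white noise as a crude non-tonal sound that is not real speech), the frame sizes, the 0.2 threshold, and the helper `is_whistle_like` are likewise hypothetical. A fixed threshold replaces the SVM and GMM back-ends used in the study.

```python
import cmath
import math
import random

SAMPLE_RATE = 8000   # Hz (assumed for this sketch)
FRAME_SIZE = 128     # samples per frame (assumed)
HOP = 64             # samples between frame starts (assumed)


def split_frames(signal, size=FRAME_SIZE, hop=HOP):
    """Split a signal into overlapping fixed-length frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def zero_crossing_rate(frame):
    """Time-domain feature: fraction of adjacent sample pairs that change sign."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def spectral_flatness(frame):
    """Frequency-domain feature: geometric mean over arithmetic mean of the
    power spectrum. Near 0 for tonal frames, higher for noise-like frames."""
    n = len(frame)
    # Hann window to limit spectral leakage.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)))
                for k, x in enumerate(frame)]
    power = []
    for b in range(1, n // 2):  # naive DFT over positive-frequency bins, DC skipped
        s = sum(x * cmath.exp(-2j * math.pi * b * k / n)
                for k, x in enumerate(windowed))
        power.append(abs(s) ** 2 + 1e-12)  # small floor keeps log() finite
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic


def is_whistle_like(frame, flatness_threshold=0.2):
    """Hypothetical frame-level decision rule (not an SVM or GMM)."""
    return spectral_flatness(frame) < flatness_threshold


# Synthetic test signals, 0.1 s each.
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 1500 * t / SAMPLE_RATE) for t in range(800)]
noise = [rng.gauss(0.0, 1.0) for _ in range(800)]

tone_frames = split_frames(tone)
noise_frames = split_frames(noise)

for name, frame_list in (("tone", tone_frames), ("noise", noise_frames)):
    flagged = sum(is_whistle_like(f) for f in frame_list)
    print(f"{name}: {flagged}/{len(frame_list)} frames flagged as whistle-like")
```

In a setup closer to the one the abstract describes, the per-frame feature values would be stacked into vectors and passed to a trained classifier instead of being compared against a hand-picked threshold.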

Keywords

Intonation (Phonetics), Speech sounds, Sound Identification, Frequency (Acoustics), Support vector machines

Collections

Hansen, John H. L.
