I-Vector Based Physical Task Stress Detection with Different Fusion Strategies
Hansen, John H. L.
Subjects often produce speech while performing a physical task in settings where speech technology may be used. Because physical tasks influence human speech production, they introduce variabilities into the speech signal, and these variabilities degrade the performance of most speech systems. It is therefore vital to detect speech produced under physical task stress so that subsequent algorithms can process it appropriately. This study presents a method for detecting physical task stress from speech. Inspired by the fact that i-vectors can generally model total factors from speech, a state-of-the-art i-vector framework is investigated with MFCCs and our previously formulated TEO-CB-Auto-Env features for neutral/physical task stress detection. Since MFCCs are derived from a linear speech production model and TEO-CB-Auto-Env features employ a nonlinear operator, these two features are believed to have complementary effects on physical task stress detection. Two alternative fusion strategies (feature-level and score-level fusion) are investigated to validate this hypothesis. Experiments over the UT-Scope Physical Corpus demonstrate that a relative accuracy gain of 2.68% is obtained when fusing different feature-based i-vectors. An additional relative accuracy gain of 6.52% is achieved using score-level fusion.
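The two fusion strategies named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the equal default weight, and the use of simple concatenation and a weighted sum are assumptions chosen for illustration, since the abstract does not specify how either fusion is carried out. The last helper shows how a relative accuracy gain is conventionally computed.

```python
from typing import List, Sequence


def feature_level_fusion(ivec_mfcc: Sequence[float],
                         ivec_teo: Sequence[float]) -> List[float]:
    """Fuse before classification: concatenate the two i-vectors.

    Assumption: plain concatenation; the fused vector would then be
    passed to a single back-end classifier.
    """
    return list(ivec_mfcc) + list(ivec_teo)


def score_level_fusion(score_mfcc: float, score_teo: float,
                       weight: float = 0.5) -> float:
    """Fuse after classification: weighted sum of two subsystem scores.

    Assumption: a linear combination with a hypothetical weight in [0, 1];
    in practice the weight would be tuned on held-out data.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * score_mfcc + (1.0 - weight) * score_teo


def relative_gain(baseline_acc: float, fused_acc: float) -> float:
    """Relative accuracy gain in percent: 100 * (fused - baseline) / baseline."""
    return 100.0 * (fused_acc - baseline_acc) / baseline_acc
```

For example, `feature_level_fusion([1.0, 2.0], [3.0])` yields a three-dimensional vector, while `score_level_fusion(1.0, 3.0)` averages the two scores. The accuracy values passed to `relative_gain` must come from an actual experiment; none are assumed here.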