Browsing by Author "Shankar, Nikhil"

Dual Microphone Speech Enhancement Algorithms on Android-Based Devices for Hearing Study
(2018-08) Shankar, Nikhil; Panahi, Issa M. S.
Speech Enhancement (SE) is a key module in the Hearing Aid (HA) signal processing pipeline and improves listening comfort. Over the last few decades, researchers have developed many single- and dual-microphone SE techniques. In this thesis, two novel dual-channel SE techniques are proposed and implemented on Android-based smartphones serving as assistive devices for HAs. The first algorithm uses the coherence between speech and noise signals to obtain an SE gain function, which is combined with a Super-Gaussian Joint Maximum a Posteriori (SGJMAP) single-microphone SE gain function. The second technique uses the Minimum Variance Distortionless Response (MVDR) beamformer as a Signal-to-Noise Ratio (SNR) booster for the SE method; the SE gain in this case is based on the Log-Spectral Minimum Mean Square Error Amplitude Estimator (Log-MMSE), which improves speech quality in the presence of different types of background noise. Objective evaluations and subjective test results show that the developed methods significantly improve speech quality and intelligibility compared with existing SE methods.
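
As a rough illustration of the MVDR step mentioned in the abstract above, the following sketch computes two-microphone MVDR weights for a single frequency bin using the textbook formula w = R^-1 d / (d^H R^-1 d). It is a minimal sketch, not the thesis implementation: the function names, the noise covariance values, and the steering vector are all assumptions chosen for illustration.

```python
# Two-microphone MVDR weights for one frequency bin (illustrative sketch).
# All names and numeric values are assumptions, not taken from the thesis.

def mvdr_weights(noise_cov, steering):
    """Return w = R^-1 d / (d^H R^-1 d) for a 2x2 Hermitian noise covariance R."""
    (a, b), (c, d) = noise_cov
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("noise covariance is singular; consider diagonal loading")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    # R^-1 d
    rinv_d = [
        inv[0][0] * steering[0] + inv[0][1] * steering[1],
        inv[1][0] * steering[0] + inv[1][1] * steering[1],
    ]
    # d^H R^-1 d, which is real and positive when R is Hermitian positive definite.
    denom = steering[0].conjugate() * rinv_d[0] + steering[1].conjugate() * rinv_d[1]
    return [x / denom for x in rinv_d]


def apply_weights(weights, mics):
    """Beamformer output y = w^H x for one frequency bin."""
    return sum(w.conjugate() * x for w, x in zip(weights, mics))


# Hypothetical noise covariance (Hermitian) and steering vector for one bin.
noise_cov = [[1.0 + 0j, 0.3 + 0.1j], [0.3 - 0.1j, 1.2 + 0j]]
steering = [1.0 + 0j, complex(0.0, -1.0)]

weights = mvdr_weights(noise_cov, steering)
# The distortionless constraint requires w^H d = 1.
distortionless = apply_weights(weights, steering)
```

Here `distortionless` should equal 1 up to rounding, which is the constraint that lets the beamformer reduce noise power without altering the signal arriving from the assumed target direction. In a real pipeline the noise covariance would have to be estimated from the microphone signals, for example during speech pauses.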

Real-Time Single and Dual-Channel Speech Enhancement on Edge Devices for Hearing Applications
(2021-04-26) Shankar, Nikhil; Panahi, Issa M.S.
Speech Enhancement (SE) is an important module in the signal processing pipeline for hearing applications and improves listening comfort. Over the last few decades, researchers have developed many single- and dual-microphone SE techniques. In this thesis, novel single- and dual-channel SE techniques are proposed and implemented on edge devices as assistive tools for hearing applications. The smartphone serves as the processing platform for real-time implementation and testing. Both statistical signal processing and deep learning algorithms are proposed for SE. First, different two-channel beamformers are compared for SE. Next, the Minimum Variance Distortionless Response (MVDR) beamformer, assisted by a voice activity detector (VAD), is used as a Signal-to-Noise Ratio (SNR) booster for the SE method. Deep neural network architectures comprising convolutional neural network (CNN) and recurrent neural network (RNN) layers are also proposed for real-time SE. Finally, to filter out background noise, the SE gain estimated for the noisy speech mixture is smoothed along the frequency axis by a Mel filter-bank, resulting in a Mel-warped frequency-domain gain estimate. Objective assessments and subjective test results indicate that the developed methods substantially improve speech quality and intelligibility compared with existing SE methods.
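
To make the Mel-smoothing idea in the abstract above more concrete, the sketch below smooths a per-bin SE gain along the frequency axis with a triangular Mel filter-bank. This is one plausible reading, not the thesis method: the filter count, the sample rate, the common 2595·log10(1 + f/700) Mel formula, and the helper names `mel_filterbank` and `smooth_gain` are all assumptions.

```python
import math

# Illustrative sketch: the parameters and helper names are assumptions.

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_bins, n_filters, sample_rate):
    """Triangular filters over n_bins linear bins spanning 0 to sample_rate / 2."""
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # n_filters + 2 edges equally spaced in Mel, expressed as fractional bin indices.
    edges = [
        mel_to_hz(max_mel * i / (n_filters + 1)) / nyquist * (n_bins - 1)
        for i in range(n_filters + 2)
    ]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def smooth_gain(gain, bank):
    """Average the gain within each Mel band, then map back to linear bins."""
    mel_gain = []
    for row in bank:
        total = sum(row)
        # Very narrow low-frequency bands may cover no bin; they are never used below.
        mel_gain.append(sum(w * g for w, g in zip(row, gain)) / total if total > 0 else 0.0)
    smoothed = []
    for k, g in enumerate(gain):
        total = sum(row[k] for row in bank)
        if total > 0:
            smoothed.append(sum(row[k] * m for row, m in zip(bank, mel_gain)) / total)
        else:
            smoothed.append(g)  # bins outside every filter keep their original gain
    return smoothed


# Hypothetical setup: 257 bins (512-point FFT), 26 filters, 16 kHz sampling.
bank = mel_filterbank(n_bins=257, n_filters=26, sample_rate=16000)
rough_gain = [0.2 if k % 2 == 0 else 0.8 for k in range(257)]
smooth = smooth_gain(rough_gain, bank)
```

Because every smoothed value is a weighted average of the input gains, the output stays within the range of the input, and the wider Mel bands at high frequencies apply heavier smoothing there than at low frequencies.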