Modeling of Driver Attention in Real World Scenarios Using Probabilistic Salient Maps

dc.contributor.advisor: Busso-Recabarren, Carlos A.
dc.contributor.advisor: Schweitzer, Haim
dc.contributor.committeeMember: Hansen, John H. L.
dc.contributor.committeeMember: Al-Dhahir, Naofal
dc.contributor.committeeMember: Kehtarnavaz, Nasser
dc.creator: Jha, Sumit
dc.date.accessioned: 2023-08-21T20:32:48Z
dc.date.available: 2023-08-21T20:32:48Z
dc.date.created: 2021-05
dc.date.issued: 2021-05-01T05:00:00.000Z
dc.date.submitted: May 2021
dc.date.updated: 2023-08-21T20:32:49Z
dc.description.abstract: Monitoring driver behavior can play a vital role in combating various road hazards. The majority of accidents can be avoided if the driver gets an adequate warning a few seconds prior to the event. Monitoring driver actions can provide insights about the driver’s intent, attention and vigilance. This information can be helpful in designing smart in-vehicle interfaces that provide necessary warnings to the driver or take control when necessary. Visual attention is one of the most important factors in driver monitoring, since most driving maneuvers strongly rely on vision. An inattentive driver may lack awareness of factors in the environment such as pedestrians, other vehicles and traffic changes. Visual attention of a driver can be monitored by either tracking the driver’s head pose or by tracking their eye movement. While advancements in computer vision have inspired various studies that can efficiently track head and eye movement from the face, these models face challenges in a naturalistic driving environment because of changes in illumination, high head rotation and occlusions. This dissertation discusses various methods to predict the driver’s visual attention using probabilistic visual maps. We collect a large-scale multimodal dataset in which 59 drivers are recorded performing various secondary activities while driving, to capture the diversity of data in a naturalistic driving environment. The subjects fixate their gaze at predetermined locations, which helps us establish a correspondence between the driver’s face and their gaze target. Using this dataset, we have performed various analyses that guided our proposed models to predict the driver’s visual attention. We establish that while the head pose of the driver has a strong correlation with the driver’s visual attention, the relationship is not one-to-one. Hence, it is not feasible to design models that can predict a single value of the driver’s gaze from the head pose.
Therefore, we take a probabilistic approach where the driver’s visual attention is predicted as a probabilistic visual map whose value at each point depends on the probability that the driver is looking in a certain direction. First, we design parametric regression models that provide a Gaussian distribution of the driver’s gaze from the driver’s head pose. The model is heteroscedastic and based on Gaussian Process Regression (GPR), which learns the distribution of gaze as a Gaussian random process that is a function of the head pose in 6 degrees of freedom. Next, we propose deep networks with convolutional and upsampling layers that perform classification on a 2D grid to obtain a visual map. The model is non-parametric and learns the distribution from the data. We propose two different models. The first model takes the head pose of the driver as the input and passes it through a fully connected layer followed by convolution and upsampling to predict the visual attention at different resolutions. The second model takes an image of the eye patch as an input and passes it through multiple layers of convolution and max pooling to obtain a low-dimensional representation of the visual attention. Subsequently, this low-dimensional representation is passed through upsampling and convolution layers to obtain a high-dimensional representation of visual attention. In our final approach, we design a fusion model that integrates the information from the driver’s head pose as well as their eye appearance to predict a visual attention map at multiple resolutions. This model follows an encoder-decoder architecture with two encoders, one each for the head pose and the gaze, and a decoder that concatenates the information from both the head pose and gaze to obtain the final visual map. We project the model prediction onto the road and evaluate it on data where the subject looks at landmarks on the road.
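To make the idea of a probabilistic visual map concrete, the following is a minimal illustrative sketch, not the dissertation's implementation. It discretizes a 2D Gaussian gaze distribution onto a grid so that each cell holds the probability of the gaze falling there. The function names, the normalized [0, 1] coordinate frame, the grid size, and the toy head-pose mapping are all assumptions made for illustration; in particular, `toy_gaze_gaussian` is a hand-written placeholder that uses only two head angles and merely mimics heteroscedastic behavior (spread growing with head rotation), whereas the abstract describes a learned GPR model over 6 degrees of freedom.

```python
import math


def toy_gaze_gaussian(yaw_deg, pitch_deg):
    """Hypothetical placeholder for a learned head-pose-to-gaze model.

    Returns a mean and per-axis standard deviation in normalized [0, 1]
    map coordinates. The spread grows with head rotation, a hand-picked
    stand-in for heteroscedastic (input-dependent) uncertainty.
    """
    mean_x = min(max(0.5 + yaw_deg / 180.0, 0.0), 1.0)
    mean_y = min(max(0.5 - pitch_deg / 180.0, 0.0), 1.0)
    spread = 0.05 + 0.002 * (abs(yaw_deg) + abs(pitch_deg))
    return (mean_x, mean_y), (spread, spread)


def gaussian_visual_map(mean, std, grid_w=8, grid_h=6):
    """Discretize an axis-aligned 2D Gaussian onto a grid_h x grid_w grid.

    Each cell is evaluated at its center and the grid is normalized so
    that all cells sum to 1, giving a probability map over gaze regions.
    """
    mean_x, mean_y = mean
    std_x, std_y = std
    cells = []
    for row in range(grid_h):
        y = (row + 0.5) / grid_h  # cell-center y coordinate
        line = []
        for col in range(grid_w):
            x = (col + 0.5) / grid_w  # cell-center x coordinate
            z = ((x - mean_x) / std_x) ** 2 + ((y - mean_y) / std_y) ** 2
            line.append(math.exp(-0.5 * z))
        cells.append(line)
    total = sum(sum(line) for line in cells)
    return [[value / total for value in line] for line in cells]


def peak_cell(vmap):
    """Return (row, col) of the most probable cell in the map."""
    best = max((v, r, c) for r, line in enumerate(vmap) for c, v in enumerate(line))
    return best[1], best[2]
```

For example, `gaussian_visual_map(*toy_gaze_gaussian(20, 5))` yields a 6 x 8 map whose mass is centered to the right of the middle of the grid. A wider head rotation produces a flatter map under this toy mapping, which reflects the abstract's point that head pose alone does not determine a single gaze value.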
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/10735.1/9755
dc.language.iso: en
dc.subject: Engineering, Electronics and Electrical
dc.title: Modeling of Driver Attention in Real World Scenarios Using Probabilistic Salient Maps
dc.type: Thesis
dc.type.material: text
thesis.degree.college: School of Engineering and Computer Science
thesis.degree.department: Electrical Engineering
thesis.degree.grantor: The University of Texas at Dallas
thesis.degree.name: PHD

Files

Original bundle

Name: JHA-PRIMARY-2022-1.pdf
Size: 65.71 MB
Format: Adobe Portable Document Format
