Unsupervised Driving Anomaly Detection in Naturalistic Driving Scenarios


August 2022






New developments in advanced driver assistance systems (ADAS) can help drivers deal with risky driving maneuvers, preventing potentially hazardous scenarios. A key challenge in these systems is to determine when to intervene. While there are situations where the need for intervention or feedback is clear (e.g., lane departure), it is often difficult to identify scenarios that deviate from normal driving conditions. These scenarios can arise from driver errors, the presence of pedestrians or bicycles, or maneuvers by other vehicles. We formulate this problem as driving anomaly detection, where the goal is to automatically identify cases that require intervention. We aim to create unsupervised multimodal solutions that do not depend on predefined rules or on hyperplanes learned from labeled data describing a few target events. Such a model should recognize anomalous driving scenarios even if similar scenarios are never observed in the training data. Toward this goal, this dissertation focuses on three main transformative goals: (a) to build robust unsupervised methods for driving anomaly detection, (b) to make the approach scalable so multiple modalities can be easily added, and (c) to make the approach interpretable so it is intuitive to understand why a given segment is detected as anomalous.

Our first aim is to build robust unsupervised methods for driving anomaly detection. We address this goal by proposing a novel conditional generative adversarial network (GAN) in which the models are constrained by the previously observed signals. The difference between the discriminator scores for the predicted and actual signals is used as a metric for detecting driving anomalies. Our model considers (1) physiological signals from the driver and (2) vehicle information obtained from the controller area network (CAN) bus. The original model was implemented with fully connected layers and handcrafted features from the physiological and CAN-Bus signals. This model was improved with two important changes. First, we explore an end-to-end solution that extracts feature representations directly from the data using convolutional neural networks (CNNs). This model also leverages temporal information using long short-term memory (LSTM) layers. Second, we improve the anomaly score using a triplet-loss function to further contrast the predicted and actual signals. The triplet-loss function creates an unsupervised framework that rewards predictions closer to the actual signals and penalizes predictions deviating from the expected signals. This approach maximizes the discriminative power of the feature embeddings to detect anomalies, leading to measurable improvements over the results obtained with our previous approach implemented with fully connected layers.

The second aim is to make the driving anomaly detection approach scalable so multiple modalities can be easily added. This is important because individual modalities have limitations. For example, by considering only the vehicle CAN-Bus data and the driver’s physiological data, our proposed approach can only detect abnormal driving scenarios when the driver reacts to the driving environment. If a driver fails to notice an abnormal driving scenario, these signals will not change and our driving anomaly scores will fail to capture the event. A model should be scalable so we can incorporate other modalities that, for example, describe the environment. Our proposed approach uses conditional GANs, pre-trained independently for each modality, to extract latent features. An attention mechanism combines the latent representations from the modalities. The entire framework is trained with the triplet-loss function to generate effective representations that discriminate between normal and abnormal driving segments. This approach is implemented with five different modalities (vehicle’s CAN-Bus signals, driver’s physiological signals, distance to nearby pedestrians, distance to nearby vehicles, and distance to nearby bicycles), achieving improved performance over alternative approaches.

The third aim is to make the approach interpretable so it is possible to understand why a given segment is detected as anomalous. We address this goal with two alternative approaches. The first approach is an example-based query algorithm that combines the aforementioned attention-based conditional GAN model with the multi-label k-nearest neighbors (ML-KNN) algorithm. Our approach relies on a few manually labeled driving segments that are used as anchors to retrieve the causes of driving anomalies in a given driving segment. These anchors are projected into the embedding created by the unsupervised driving anomaly detection system, providing an ideal space in which to compare an anomalous driving segment detected by the system with the anchors. The second approach is an unsupervised method based on the contrastive multiview coding (CMC) framework, which captures the correlations in representations extracted from different modalities. The approach learns a more discriminative representation space for unsupervised driving anomaly detection. We use CMC to train our model to extract view-invariant factors by maximizing the mutual information between multiple representations from a given view, and increasing the distance of views from unrelated segments. The approach is efficient, scalable, and interpretable: the distances in the contrastive embedding for each view can be used to understand potential causes of the detected anomalies.

The proposed solutions are trained and evaluated with 130 hours of naturalistic driving data manually annotated with driving events. The results demonstrate the benefits of the proposed solutions. Collectively, these advances represent transformative contributions toward building scalable, interpretable, and discriminative algorithms to identify anomalous driving events.
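
As a rough illustration of the triplet-loss idea mentioned in the abstract, the sketch below implements the standard triplet margin loss on plain Python lists. It is not the dissertation's formulation. The function names (`euclidean`, `triplet_loss`, `anomaly_score`), the margin value, the roles assigned to anchor, positive, and negative, and the use of an embedding distance as an anomaly score are all assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    Assumed roles (hypothetical, for illustration only):
      anchor   - embedding of the actual signals
      positive - embedding of a prediction that should be close to the anchor
      negative - embedding that should be far from the anchor
    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def anomaly_score(predicted, actual):
    """Hypothetical score: distance between predicted and actual embeddings.

    A larger distance means the observed segment deviates more from
    what the model expected.
    """
    return euclidean(predicted, actual)
```

Under these assumptions, a segment whose actual embedding lies far from the predicted one receives a high score, while the loss pushes well-predicted segments toward a score near zero.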



Engineering, Electronics and Electrical