Towards AI and Hardware Synergy
Abstract
This research explores the symbiotic relationship between Artificial Intelligence (AI) and hardware, with a specific emphasis on their intersection. In this dissertation, we consider two research thrusts: (i) AI for Hardware and (ii) Hardware for AI. In recent years, custom hardware-based AI solutions have been widely adopted to solve a plethora of real-world problems; for instance, researchers have proposed incorporating AI into numerous mission-critical applications, especially in high-assurance environments. To this end, the first research thrust focuses on developing AI techniques catered to hardware: specifically, we propose novel low-latency, high-fidelity AI workloads to ensure the reliability of automotive hardware. The second research thrust, in contrast, is associated with specialized hardware for AI. Despite the ubiquitous use of AI solutions in real-world applications such as facial recognition and autonomous vehicles, their deployment on hardware is often inefficient, especially on resource-constrained platforms. The second research thrust therefore aims to facilitate the efficient deployment of AI workloads on dedicated hardware platforms. To this end, we formulate two main problems in this research, which are explained in detail below.

The first aspect of this dissertation focuses on ensuring the Functional Safety (FuSa) of automotive systems. With the increasing prevalence of safety-critical applications in the automotive domain, it is imperative to ensure the FuSa of the circuits and components within the associated systems, which are predominantly Analog and Mixed-Signal (AMS) circuits. However, existing AI-based AMS FuSa violation detection solutions are limited by predefined feature inputs and lack a rationale for determining which signals should be monitored for anomaly detection. To address these challenges, we propose a novel unsupervised Machine Learning (ML)-based framework for early anomaly detection in automotive AMS circuits. Our approach comprises the injection of anomalies into automotive AMS circuits to generate a comprehensive anomaly model, together with novel centroid selection and time-series methodologies for expedited, high-fidelity anomaly detection. The proposed anomaly detection framework furnishes up to 100% detection accuracy and reduces the associated latency by 5× compared to the non-time-series approach. Subsequently, we augment this solution with novel feature and signal selection techniques, as well as an Explainable AI (XAI) framework for enhanced user interpretability and transparency, achieving up to a 7.2% improvement in detection accuracy and a 2.3× reduction in detection latency over our prior approach. Following this, we perform anomaly abstraction to study the impact of anomalies across multiple abstraction levels in automotive systems, achieving high-fidelity anomaly detection (up to 100% detection accuracy) in both component-level and block-level implementations.
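To make the centroid-based, time-series detection flow concrete, the minimal sketch below clusters fixed-length windows of anomaly-free AMS signal traces and flags test windows that fall far from every nominal centroid. This is an illustrative approximation only: the window length, centroid count, threshold calibration, and function names are hypothetical placeholders rather than the dissertation's actual framework.

import numpy as np
from sklearn.cluster import KMeans

def make_windows(trace, window=32, stride=8):
    """Slice a 1-D signal trace into overlapping fixed-length windows."""
    starts = range(0, len(trace) - window + 1, stride)
    return np.stack([trace[i:i + window] for i in starts])

def fit_nominal_centroids(nominal_traces, n_centroids=4, window=32):
    """Cluster windows drawn from anomaly-free traces to obtain nominal centroids."""
    windows = np.vstack([make_windows(t, window) for t in nominal_traces])
    return KMeans(n_clusters=n_centroids, n_init=10, random_state=0).fit(windows)

def detect_anomalies(trace, model, threshold, window=32):
    """Flag windows whose distance to the nearest nominal centroid exceeds the threshold."""
    distances = model.transform(make_windows(trace, window)).min(axis=1)
    return distances > threshold  # True marks a suspected anomaly window

if __name__ == "__main__":
    # Synthetic stand-ins for simulated AMS traces (hypothetical, for illustration only).
    rng = np.random.default_rng(0)
    t = np.linspace(0, 1, 512)
    nominal = [np.sin(2 * np.pi * 5 * t) + 0.05 * rng.standard_normal(t.size) for _ in range(8)]
    model = fit_nominal_centroids(nominal)
    # Calibrate the threshold from nominal-window distances (here, a 99th-percentile heuristic).
    nominal_dists = model.transform(np.vstack([make_windows(x) for x in nominal])).min(axis=1)
    threshold = np.percentile(nominal_dists, 99)
    faulty = nominal[0].copy()
    faulty[200:240] += 2.0  # injected anomaly, e.g., an out-of-range voltage excursion
    flags = detect_anomalies(faulty, model, threshold)
    print(f"{int(flags.sum())} of {flags.size} windows flagged as anomalous")

In a detector of this kind, the threshold would typically be calibrated on held-out nominal traces so that the false-alarm rate remains within the functional-safety budget.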
Moreover, since we aim to deploy our AI solution on-chip for in-field applications, it is imperative to enable efficient resource utilization for real-world applicability. This necessitates real-time, low-power AI workload deployment, which we address in the second research thrust. The second aspect of this dissertation pertains to Application-Specific Integrated Circuits (ASICs), such as Deep Neural Network (DNN) hardware accelerators, wherein we optimize the energy efficiency of DNN inference.

The proliferation of DNNs in recent years has led to their widespread application in addressing a myriad of real-world challenges. However, owing to their significant computational and power requirements, specialized hardware platforms, such as DNN accelerators, have been developed. Despite these advancements, DNN inference remains subject to energy bottlenecks on these resource-constrained accelerators. To address these issues, we first propose a novel low-power, hardware-based memory compression solution catered to commercial DNN accelerators. Our approach, which optimizes the memory subsystem of deep learning systems, involves hardware-based post-quantization weight trimming, followed by dictionary-based compression and subsequent decompression by a low-power hardware engine during inference in the accelerator. Our technique furnishes up to 28571× reduction in memory footprint, while incurring negligible decompression area and power overheads of around 0.02% and 0.002%, respectively.

Following this, we propose a novel sensor compression solution designed to optimize the energy efficiency of DNN sensor subsystems. The proposed solution employs a two-step process: subsampling of sensor images followed by supersampling through interpolation (a brief illustrative sketch appears at the end of this abstract). Furthermore, we develop a fault injection framework to assess the resilience of DNNs operating on compressed sensor inputs to bit-flip faults in the DNN accelerator memory. Our solution furnishes up to 62.1% energy savings with a marginal performance degradation of 0.83%. Moreover, our results indicate that DNN accelerators witness up to a 21.56% loss in classification accuracy for compressed sensor inputs, rendering them highly vulnerable to bit-flip faults manifested in their memory blocks. Therefore, by optimizing both the memory and sensor subsystems, we seek to enhance the overall efficiency and performance of deep learning systems, particularly in resource-constrained environments.

In conclusion, this dissertation proposes pioneering approaches to achieving synergy between AI and hardware, with the objective of improving the performance, safety, and power efficiency of systems situated at the confluence of the two fields. This research presents novel solutions to the challenges of automotive AMS functional safety and low-power DNN implementation on resource-constrained IoT edge devices, offering valuable contributions to the AI and hardware research communities and fostering the development of more robust and efficient systems.
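Finally, as a minimal illustration of the subsample-then-supersample sensor compression step described above, the sketch below shrinks a sensor frame by a fixed factor and reconstructs it with bilinear interpolation before it would be fed to a DNN. The 4× factor, function names, and use of scipy.ndimage.zoom are hypothetical choices for illustration and do not reflect the dissertation's actual implementation.

import numpy as np
from scipy.ndimage import zoom

def subsample(frame, factor=4):
    """Keep every factor-th pixel in each dimension, shrinking the sensor data to move and store."""
    return frame[::factor, ::factor]

def supersample(small, factor=4):
    """Reconstruct the original resolution via interpolation (order=1 selects bilinear)."""
    return zoom(small, factor, order=1)

if __name__ == "__main__":
    raw = np.random.rand(224, 224).astype(np.float32)  # stand-in for a sensor frame
    reconstructed = supersample(subsample(raw))
    print(raw.shape, reconstructed.shape)  # both (224, 224)

In such a scheme, only the subsampled frame would need to traverse the sensor interface and accelerator memory, which is where the energy savings would arise; the interpolation restores the resolution the DNN expects at the cost of some reconstruction error.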