A Machine Learning Based Hard Fault Recuperation Model for Approximate Hardware Accelerators

Publisher

The Association for Computing Machinery

Abstract

The continuous pursuit of higher performance and energy efficiency has led to heterogeneous SoCs that contain multiple dedicated hardware accelerators. These accelerators exploit the inherent parallelism of tasks and are often tolerant of inaccuracies in their outputs, e.g., in image and digital signal processing applications. At the same time, permanent faults are escalating due to process scaling and power restrictions, leading to erroneous outputs. To address this issue, we propose a low-cost, universal fault recovery/repair method that uses supervised machine learning techniques to ameliorate the effect of permanent fault(s) in hardware accelerators that can tolerate inexact outputs. The proposed compensation model does not require any information about the accelerator and is highly scalable with low area overhead. Experimental results show that, compared to execution without fault compensation, the proposed method improves accuracy by 50% and decreases the overall mean error rate by 90%, with an area overhead of 5%.
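
The general idea of supervised fault compensation can be illustrated with a minimal sketch. It is not the model from the paper, and it does not reproduce the reported numbers. Everything in it is an assumption made for illustration: the toy accelerator (an 8-bit average of two pixels), the fault (output bit 5 stuck at 0), the single-feature least-squares regressor, and all function names. The compensator sees only pairs of faulty and fault-free outputs, so it treats the accelerator as a black box.

```python
import random

STUCK_BIT = 5  # hypothetical fault location: output bit 5 stuck at 0


def golden(a, b):
    """Toy fault-free accelerator: 8-bit average of two pixels."""
    return (a + b) // 2


def faulty(a, b):
    """Same accelerator with one output bit permanently stuck at 0."""
    return golden(a, b) & ~(1 << STUCK_BIT)


def fit_linear(xs, ys):
    """Ordinary least squares for y ~ w * x + c with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = cov_xy / var_x
    return w, mean_y - w * mean_x


def mse(pred, target):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def sample(rng, n):
    """Draw n random pairs of 8-bit inputs."""
    return [(rng.randrange(256), rng.randrange(256)) for _ in range(n)]


rng = random.Random(0)
train, test = sample(rng, 5000), sample(rng, 5000)

# Train on (faulty output, fault-free output) pairs only.
w, c = fit_linear([faulty(a, b) for a, b in train],
                  [golden(a, b) for a, b in train])

test_faulty = [faulty(a, b) for a, b in test]
test_golden = [golden(a, b) for a, b in test]

# Apply the learned correction and clamp back to the 8-bit output range.
compensated = [min(255, max(0, round(w * f + c))) for f in test_faulty]

mse_before = mse(test_faulty, test_golden)
mse_after = mse(compensated, test_golden)
print(f"MSE without compensation: {mse_before:.1f}")
print(f"MSE with compensation:    {mse_after:.1f}")
```

A stuck-at fault destroys information, so a simple corrector like this cannot restore exact outputs; it only reduces the average error, which is acceptable when the application tolerates inexact results.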

Description

Full text access from Treasures at UT Dallas is restricted to current UTD affiliates (use the provided Link to Article). Non-UTD affiliates can find the web address for this item by clicking the "Show full item record" link, copying the "dc.relation.uri" metadata, and pasting it into a browser.

Keywords

Fault-tolerant computing, Machine learning, Supervised learning (Machine learning), Artificial intelligence, Computer-aided design, Signal processing--Digital techniques, Energy consumption, Computers, Self-organizing systems, Systems on a chip, Parallel programming (Computer science)

Rights

©2018 ACM

Citation