Browsing by Author "Zhou, Dian"
Item: A General Graph Based Pessimism Reduction Framework for Design Optimization of Timing Closure (ACM). Peng, F.; Yan, C.; Feng, C.; Zheng, J.; Wang, S.-G.; Zhou, Dian; Zeng, X.
In this paper, we develop a general pessimism reduction framework for design optimization of timing closure. Although the modified graph-based timing analysis (mGBA) slack model can be readily formulated as a quadratic programming problem with constraints, the practical difficulty is the size of the problem. A critical-path selection scheme, a uniform sampling method exploiting the sparsity of the optimal solution, and a stochastic conjugate gradient method are proposed to accelerate the optimization solver. The modified GBA is then embedded into design optimization of timing closure. Experimental results show that the proposed solver achieves a 13.82x speedup over the gradient descent method with similar accuracy. With mGBA, timing-closure optimization achieves better area, leakage power, and buffer counts.

Item: An Efficient Bayesian Yield Estimation Method for High Dimensional and High Sigma SRAM Circuits (Institute of Electrical and Electronics Engineers Inc.). Zhai, J.; Yan, C.; Wang, S.-G.; Zhou, Dian
With the increasing dimension of the variation space and computationally intensive circuit simulation, accurate and fast yield estimation of realistic SRAM chips remains a significant and complicated challenge. Experiment results show that the proposed method has an almost constant time complexity as the dimension increases, and gains a 6x speedup over the state-of-the-art method in the 485D cases.

Item: Bayesian Optimization Approach for Analog Circuit Synthesis Using Neural Network (Institute of Electrical and Electronics Engineers Inc., 2019-03-25). Zhang, S.; Lyu, W.; Yang, F.; Yan, C.; Zhou, Dian; Zeng, X.
Bayesian optimization with a Gaussian process as the surrogate model has been successfully applied to analog circuit synthesis.
In the traditional Gaussian process regression model, the kernel functions are defined explicitly. The computational complexity of training is O(N³), and that of prediction is O(N²), where N is the number of training data. The Gaussian process model can also be derived from a weight-space view, where the original data are mapped to a feature space and the kernel function is defined as the inner product of the nonlinear features. In this paper, we propose a Bayesian optimization approach for analog circuit synthesis using a neural network. We use a deep neural network to extract good feature representations, and then define a Gaussian process using the extracted features. A model averaging method is applied to improve the quality of the uncertainty prediction. Compared to a Gaussian process model with explicitly defined kernel functions, the neural-network-based Gaussian process model can automatically learn a kernel function from data, which makes it possible to provide more accurate predictions and thus accelerate the follow-up optimization procedure. Also, the neural-network-based model has O(N) training time and constant prediction time. The efficiency of the proposed method has been verified on two real-world analog circuits. © 2019 EDAA.

Item: Efficient and Quality Assured Techniques for Analog Circuit Design Automation (2017-12). Bi, Zhaori; 0000-0002-7315-3150 (Bi, Z); Zhou, Dian; Nourani, Mehrdad; Namgoong, Won; Lee, Hoi
Automating the design of analog and mixed-signal circuits is challenging because circuit designs are heuristics-intensive and performance evaluations are expensive. This dissertation addresses multiple strategies to enhance the quality and efficiency of circuit design automation. After comparing various global optimization solvers such as Evolutionary Algorithms (EA), Simulated Annealing (SA), and Genetic Algorithms (GA), we introduce the Random Region Covering (RRC) method as our global optimizer.
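The weight-space view in the abstract above (a GP as Bayesian linear regression over nonlinear features) can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' model: the "network" here is a fixed random ReLU feature map standing in for a trained deep network, and all sizes and names are hypothetical. What it demonstrates is the complexity claim: training touches each of the N points once, so the cost is linear in N for a fixed feature width M.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Fixed random ReLU feature map standing in for a trained deep network;
# a real implementation would learn W and b from data.
M = 32
W = rng.normal(size=(1, M))
b = rng.normal(size=M)

def features(X):
    # X: (N, 1) -> (N, M) nonlinear feature matrix phi(X)
    return relu(X @ W + b)

# Bayesian linear regression on the features is a GP whose kernel is the
# inner product of features, k(x, x') = phi(x)^T phi(x'). Forming
# Phi^T Phi costs O(N * M^2): linear in N, unlike the O(N^3) Cholesky
# factorization needed by an explicit-kernel GP.
def fit(X, y, noise=1e-2, prior=1.0):
    Phi = features(X)
    A = Phi.T @ Phi / noise + np.eye(M) / prior
    cov_w = np.linalg.inv(A)
    mean_w = cov_w @ (Phi.T @ y) / noise
    return mean_w, cov_w

def predict(X, mean_w, cov_w, noise=1e-2):
    phi = features(X)
    mu = phi @ mean_w
    var = np.sum((phi @ cov_w) * phi, axis=1) + noise
    return mu, np.sqrt(var)

X = rng.uniform(-2.0, 2.0, size=(50, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=50)
mean_w, cov_w = fit(X, y)
mu, sigma = predict(X, mean_w, cov_w)
```

Prediction for one point is O(M²) regardless of N, which is the "constant prediction time" the abstract refers to.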
RRC explores the landscape by initiating local optimization solvers from multiple random starting points. The optimization quality improves as the number of starting points increases. We propose the Random Region Covering Theory (RRCT) to explain why this technique is efficient at searching for the global optimum. In addition to analyzing the efficiency of RRC, the theory gives a probability-based estimate of the goodness of the optimization result. Quantifying the goodness of the current design has two advantages. First, we can estimate the improvement margin of the candidate design and thus avoid the extra cost of over-optimizing an already qualified design. Second, we can estimate the cost of achieving the design goal, which provides a sound termination condition for the optimization flow. To enhance efficiency, an optimization scheme should either speed up the circuit simulation or invoke the high-cost circuit simulator as little as possible. A common technique for improving circuit simulation efficiency is to replace the transistor-level model with a behavioral-level model. However, the accuracy of equation-based or knowledge-based behavioral models is problem dependent: for new circuit topologies, these methods require fitted mathematical models that are time consuming and difficult to develop, particularly with respect to Process, Voltage, and Temperature (PVT) variations. Instead of directly applying a numerical optimization algorithm to the full transistor-level response surface, it is more efficient to apply the optimization to a surrogate model trained on an iteratively updated, high-fidelity simulation database. The accuracy of the surrogate model then becomes the key to achieving high-quality optimization results. This dissertation proposes a novel optimization scheme combining the advantages of the Gaussian process (GP) model with the RRC optimizer.
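The details of RRC are not spelled out in this abstract, but the core multistart idea it describes — launch local solvers from many random starting points and keep the best result — can be sketched as follows. The objective, the finite-difference local solver, and all parameters here are illustrative stand-ins, not the dissertation's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Multimodal toy objective: global minimum f(0) = 0, plus many shallow
# local minima created by the sin^2 term.
def f(x):
    return x**2 + 2.0 * np.sin(5.0 * x) ** 2

def local_descent(x, lr=0.01, steps=200, eps=1e-6):
    # Plain finite-difference gradient descent, standing in for any
    # off-the-shelf local optimization solver.
    for _ in range(steps):
        g = (f(x + eps) - f(x - eps)) / (2.0 * eps)
        x -= lr * g
    return x

def multistart(n_starts, lo=-4.0, hi=4.0):
    # Multistart search in the spirit of RRC: each random start "covers"
    # one region of the landscape, so more starts raise the probability
    # that some start falls in the basin of the global optimum.
    best_x, best_f = None, np.inf
    for _ in range(n_starts):
        x = local_descent(rng.uniform(lo, hi))
        if f(x) < best_f:
            best_x, best_f = x, f(x)
    return best_x, best_f

x_star, f_star = multistart(50)
```

In the RRCT framing, the fraction of starts that converge to the same best value also yields a rough probability-based estimate of how good the current optimum is, which is what supports a principled termination condition.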
We perform experiments comparing the proposed technique with well-known Bayesian Optimization (BO) methods; the results demonstrate the effectiveness of the proposed method. The DesignEasy software was developed to implement the above functions and to provide a general User Interface (UI) for circuit design automation.

Item: Evaluation of an Interdigitated Array Electrode Device for IoT Water Testing (2018-05). Kirkpatrick, Brandi James; Hu, Wenchuang (Walter); Zhou, Dian
Current methods of water quality testing require bulky lab equipment or chemical reagents added to samples to obtain measurements. It is desirable to develop a device capable of evaluating multiple qualities of a water sample in a single compact, stand-alone system. With such a device, integrating water testing into the developing Internet of Things would become feasible, with considerable improvements over current solutions. The interdigitated array electrode (IDA) is a type of circuit whose capacitance varies with the environment it is exposed to. With all other properties fixed upon fabrication of the device, changes in the permittivity of the surrounding medium make it possible to measure properties of the medium itself by observing the overall change in capacitance as well as the detected potential across the device. Such shifts are observed to have a monotonic relationship to the total dissolved solids (TDS) level and free chlorine content in aqueous environments similar to drinking water. The measurement method used to observe these relationships is shown to have greater sensitivity than alternative methods, yielding a more robust system with higher resolution. Long-term stability is a concern for electronic devices such as IDAs that are exposed to water.
This paper evaluates two hydrophobic dielectric coatings, HfO2 and OTS-Al2O3, used as protective layers for the IDA device in water with considerably greater salinity than the expected operating conditions, to show that such devices can maintain proper signal integrity over time. Similar experimentation is demonstrated for chlorinated water and the corresponding readings for free chlorine measurement. The paper also discusses the developments underway to integrate the IDA into a complete IoT system and the complications that must be overcome, and explores alternative applications of the IDA using similar measurement methods.

Item: Machine Learning-Empowered Smart Health Wearables for Long-Term High-Resolution Heart Rate and Blood Pressure Tracking (2017-05). Zhang, Qingxue; Zhou, Dian
Wearable computers are paving a promising way for ubiquitous smart health applications. Long-term high-resolution heart rate and blood pressure tracking is highly significant not only for medical purposes such as heart disease and hypertension management, but also for wellness and fitness applications such as emotion, stress, and sport performance monitoring. It is highly challenging to robustly estimate the high-resolution (second-level) heart rate in long-term application scenarios, due to motion artifacts caused by diverse body movements. We propose two novel electrocardiogram (ECG)-based instantaneous heart rate monitoring frameworks tolerant to severe and continuous motion artifacts. The first framework is phase-domain multiview dynamic time warping, which transforms the raw ECG signal into a multi-dimensional phase space to reveal more geometric characteristics of heartbeats, and then performs multiview dynamic time warping to identify heartbeats similar to a predefined heartbeat template. Its heart rate estimation performance greatly outperforms previously reported approaches.
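The phase-space transform and template matching in that first framework can be illustrated with a small sketch: a time-delay embedding (one common way to build a "phase space" from a 1-D signal; the dissertation's exact construction may differ) plus textbook dynamic time warping against a heartbeat template. The QRS-like Gaussian bump and all parameters below are invented for the example.

```python
import numpy as np

def delay_embed(x, dim=3, lag=2):
    # Map a 1-D signal into a multi-dimensional "phase space" by stacking
    # time-delayed copies, exposing geometric structure of each beat.
    n = len(x) - (dim - 1) * lag
    return np.stack([x[i * lag : i * lag + n] for i in range(dim)], axis=1)

def dtw_distance(a, b):
    # Classic dynamic time warping between two multi-dimensional sequences.
    na, nb = len(a), len(b)
    D = np.full((na + 1, nb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, na + 1):
        for j in range(1, nb + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[na, nb]

# A candidate segment is accepted as a heartbeat when its warped distance
# to the template is small; a time-shifted copy of the beat should match
# far better than a segment containing no beat at all.
t = np.linspace(0.0, 1.0, 40)
template = np.exp(-((t - 0.50) ** 2) / 0.01)          # idealized QRS-like bump
shifted = np.exp(-((t - 0.55) ** 2) / 0.01)           # same beat, slightly shifted
noise = np.random.default_rng(2).normal(0.0, 0.3, 40) # no beat present

d_beat = dtw_distance(delay_embed(shifted), delay_embed(template))
d_noise = dtw_distance(delay_embed(noise), delay_embed(template))
```

The warping absorbs the time shift (which is exactly why DTW tolerates beat-to-beat timing variation), so the shifted beat scores a much smaller distance than the noise segment.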
The second framework is a two-stage classification approach, which does not sweep the entire signal stream to find heartbeats; instead, it focuses only on the signal spikes of interest and classifies them as heartbeats or interferential spikes. Since it ignores the non-spike signal segments, it incurs a lower computational load. We extract twenty-six features and select ten features critical under motion artifacts using a sparse support vector machine (SVM). These features are then used to train an SVM model to perform heartbeat identification, after which a refinement engine is introduced to purify the heartbeats. The performance is superior to many well-known approaches. Moreover, we propose two wearable cuff-less blood pressure monitoring systems, one with single-arm ECG and photoplethysmogram (PPG) signals and another with ear-ECG and PPG signals. In the first system, the weak single-arm signals are successfully acquired by our bio-potential acquisition platform, and the heartbeats are then identified from these weak signals. A thorough comparative analysis of diverse blood pressure models is also performed to determine an appropriate one for single-arm applications. Experimental results show that the system can robustly estimate minute-level blood pressure based only on single-arm signals. In the second system, both weak ECG and PPG signals are successfully acquired by placing all the sensors behind the ear for superior wearability. Moreover, we introduce large amounts of motion artifacts by performing head movements representative of practical application scenarios, and then propose an unsupervised learning strategy to automatically evaluate the signal distortion level and perform signal purification. Experimental results also show the effectiveness of the proposed ear-worn blood pressure and heart rate monitoring system.
The proposed smart health wearables are expected to contribute to pervasive health, wellness, and fitness management.

Item: Metastability-Zone Based Quantization for High-Speed Analog-to-Digital Converters (December 2023). Ren, Jiajun; Liu, Jin; Bereg, Sergey; Lee, Hoi; Zhou, Dian; Carrion Schaefer, Benjamin
High-speed analog-to-digital converters (ADCs) are in high demand in both wireless and wireline communication systems. These systems normally require 6- to 8-bit resolution for further data processing, and preferably even higher resolution in some areas such as satellite communication. However, because of the weak signals arriving at receivers, the system's effective least significant bit (LSB) becomes small. In ADCs with the traditional quantization technique, the LSB is always limited by the comparator offset. Meanwhile, a high-clock-rate comparator with a small LSB can easily fall into a metastable state during quantization, which is a dominant source of bit-error-rate (BER). In this dissertation, a high-speed ADC quantization method based on metastability zones is presented. Compared with the traditional comparator-based quantization method, which requires sufficient regeneration time and a large enough LSB to generate valid logic outputs, the proposed method can significantly reduce the regeneration time to achieve a higher sampling rate and a finer quantization scale. To validate the proposed quantization method, a MATLAB model is created for mathematical verification. Two 2-GS/s, 6-bit flash ADCs with different calibration schemes are designed in 130nm CMOS for block-level verification. Finally, a 3.2-GS/s, 6-bit flash ADC prototype is designed and fabricated in 65nm CMOS. The measured results show that the system achieves a 5mV least significant bit, which is much smaller than in traditional flash ADCs using comparator-based quantization and is in the range of designs using comparator interpolation techniques.
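Energy efficiency of such converters is usually compared with the Walden figure-of-merit, FOM = P / (2^ENOB · f_s), i.e., energy per conversion step. The tiny sketch below uses a purely illustrative power number (not taken from the dissertation) that happens to be consistent with a 263 fJ/conv-step result at the nominal 6 bits and 3.2 GS/s; in practice ENOB sits below the nominal resolution.

```python
def walden_fom(power_watts, enob_bits, fs_hz):
    # Walden FOM: energy spent per effective conversion step (J/conv-step).
    return power_watts / (2.0 ** enob_bits * fs_hz)

# Hypothetical example: ~54 mW at 6 bits and 3.2 GS/s is roughly
# 263 fJ per conversion step.
fom_fj = walden_fom(54e-3, 6, 3.2e9) * 1e15  # femtojoules per step
```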
At the same time, the sampling rate with the proposed quantization method is higher. The figure-of-merit of 263 fJ/conv-step is the best among state-of-the-art designs with similar resolution and sampling rate.

Item: Multi-Objective Bayesian Optimization for Analog/RF Circuit Synthesis (Institute of Electrical and Electronics Engineers Inc.). Lyu, W.; Yang, F.; Yan, C.; Zhou, Dian; Zeng, X.
In this paper, a novel multi-objective Bayesian optimization method is proposed for the sizing of analog/RF circuits. The proposed approach follows the framework of Bayesian optimization to balance exploitation and exploration. Gaussian processes (GP) are used as the online surrogate models for the multiple objective functions. The lower confidence bound (LCB) functions are taken as the acquisition functions to select the data point with the best Pareto-dominance and diversity. A modified non-dominated-sorting-based evolutionary multi-objective algorithm is proposed to find the Pareto Front (PF) of the multiple LCB functions, and the next simulation point is chosen from this PF. Compared with multi-objective evolutionary algorithms (MOEA) and the state-of-the-art online-surrogate-model-based circuit optimization method, our method better approximates the Pareto Front while significantly reducing the number of circuit simulations. © 2018 Association for Computing Machinery.

Item: Multi-Objective Bayesian Optimization for Analog/RF Circuit Synthesis (Institute of Electrical and Electronics Engineers Inc., 2019). Zeng, X.; Lyu, W.; Yang, F.; Yan, C.; Zhou, Dian
In this paper, a novel multi-objective Bayesian optimization method is proposed for the sizing of analog/RF circuits. The proposed approach follows the framework of Bayesian optimization to balance exploitation and exploration. Gaussian processes (GP) are used as the online surrogate models for the multiple objective functions.
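A single-objective slice of the acquisition step described in these abstracts can be sketched as follows: fit a GP to the points simulated so far, then pick the candidate minimizing the LCB, mu(x) - kappa*sigma(x), so that both a low predicted value (exploitation) and high uncertainty (exploration) attract the search. The kernel, kappa, and toy objective are illustrative choices; the papers' method builds one GP and one LCB per objective and selects the next point from the Pareto front of the LCBs rather than a scalar argmin.

```python
import numpy as np

rng = np.random.default_rng(3)

def rbf(A, B, ls=0.5):
    # Squared-exponential kernel between two 1-D point sets.
    d2 = (A[:, None, 0] - B[None, :, 0]) ** 2
    return np.exp(-0.5 * d2 / ls**2)

def gp_posterior(X, y, Xs, noise=1e-4):
    # Textbook GP regression posterior mean and standard deviation.
    K = rbf(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(Xs, X)
    mu = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.clip(np.diag(rbf(Xs, Xs)) - np.sum(v**2, axis=0), 0.0, None)
    return mu, np.sqrt(var)

def lcb(mu, sigma, kappa=2.0):
    # Lower confidence bound: an optimistic estimate for minimization.
    return mu - kappa * sigma

# Toy objective standing in for one expensive circuit-performance metric.
def f(X):
    return np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 0] ** 2

X = rng.uniform(-2.0, 2.0, size=(6, 1))    # points simulated so far
y = f(X)
Xs = np.linspace(-2.0, 2.0, 200)[:, None]  # candidate grid
mu, sigma = gp_posterior(X, y, Xs)
x_next = Xs[np.argmin(lcb(mu, sigma))]     # next point to simulate
```

Each iteration spends one real simulation at x_next and refits the GP, which is how the approach cuts the number of circuit simulations relative to a plain MOEA.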
The lower confidence bound (LCB) functions are taken as the acquisition functions to select the data point with the best Pareto-dominance and diversity. A modified non-dominated-sorting-based evolutionary multi-objective algorithm is proposed to find the Pareto Front (PF) of the multiple LCB functions, and the next simulation point is chosen from this PF. Compared with multi-objective evolutionary algorithms (MOEA) and the state-of-the-art online-surrogate-model-based circuit optimization method, our method better approximates the Pareto Front while significantly reducing the number of circuit simulations. © 2018 Association for Computing Machinery.

Item: Semi-Supervised Learning with Label Confidence for Automatic Knee Osteoarthritis Severity Assessment (2022-05). Wang, Yifan; Zhou, Dian; Sarac, Kamil; Liu, Jin; Nourani, Mehrdad; Tamil, Lakshman
Knee osteoarthritis (OA) is a chronic disease that considerably reduces patients' quality of life. Preventive therapies require early detection and lifetime monitoring of OA progression. In the clinical environment, the severity of OA is classified by the Kellgren and Lawrence (KL) grading system, ranging from KL-0 to KL-4. Recently, deep learning methods were applied to OA severity assessment to improve accuracy and efficiency. Researchers fine-tuned convolutional neural networks (CNN) on OA datasets and built end-to-end approaches. However, this task is still challenging due to the ambiguity between adjacent grades, especially in early-stage OA. Low-confidence samples, which are less representative than typical ones, undermine the training process. Targeting the uncertainty in the OA dataset, we propose a novel learning scheme that dynamically separates the data into two sets according to their reliability. In addition, we design a hybrid loss function to help the CNN learn from the two sets accordingly.
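The exact hybrid loss is not given in this abstract; the sketch below shows one plausible shape of such a scheme, assuming a confidence threshold splits the batch and cross-entropy from the low-confidence set is down-weighted so ambiguous KL grades cannot dominate training. All names, the threshold, and the 0.3 weight are invented for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(p, labels):
    # Per-sample cross-entropy against integer class labels.
    return -np.log(p[np.arange(len(labels)), labels] + 1e-12)

def hybrid_loss(logits, labels, reliable_mask, low_conf_weight=0.3):
    # Hypothetical hybrid objective: reliable samples contribute full
    # cross-entropy; low-confidence samples are kept but down-weighted.
    p = softmax(logits)
    ce = cross_entropy(p, labels)
    w = np.where(reliable_mask, 1.0, low_conf_weight)
    return np.sum(w * ce) / np.sum(w)

def split_by_confidence(logits, labels, thresh=0.5):
    # Dynamic split: a sample counts as reliable when the model's
    # probability for its given label exceeds a threshold (one simple
    # confidence proxy; the dissertation's criterion may differ).
    p = softmax(logits)
    return p[np.arange(len(labels)), labels] > thresh

# Three toy samples: confident, ambiguous, confident.
logits = np.array([[3.0, 0.1, 0.1], [0.2, 0.3, 0.1], [0.1, 2.5, 0.2]])
labels = np.array([0, 1, 1])
mask = split_by_confidence(logits, labels)
loss = hybrid_loss(logits, labels, mask)
```

Recomputing the split every epoch lets the scheme "refine" the data as training progresses, which mirrors the dynamic separation the abstract describes.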
With the proposed approach, we emphasize the typical samples and control the impact of low-confidence cases. Experiments are conducted in a five-fold manner on the five-class task and the early-stage OA task. Our method achieves a mean accuracy of 70.13% on the five-class OA assessment task, outperforming all other state-of-the-art methods. Although early-stage OA detection still benefits from human intervention in lesion region selection, our approach achieves superior performance on the KL-0 vs. KL-2 task. Moreover, we design an experiment to validate large-scale automatic data refining during training; the result verifies our approach's ability to characterize low-confidence samples. The dataset used in this paper was obtained from the Osteoarthritis Initiative.

Item: Towards a High-Performance and Secure Memory System and Architecture for Emerging Applications (2021-12). Wang, Zhendong; Hu, Yang; Hays, Seth; Hansen, John H.L.; Liu, Cong; Zhou, Dian
In the 5G era, diverse types of artificial intelligence (AI) and Internet of Things (IoT) applications are emerging in our lives, such as smart homes, virtual reality, and autonomous vehicles. These applications typically impose diversified requirements in real deployments in terms of latency, privacy, security, etc., and stimulate the evolution and prosperity of heterogeneous computing. In this dissertation, heterogeneous computing refers to the scheme in which different computing Processing Units (PUs) with differentiated computing capacities are effectively coordinated and managed to achieve computing gains. Representative PUs include the CPU, Graphics Processing Unit (GPU), Field Programmable Gate Array (FPGA), and Application-Specific Integrated Circuits (ASIC).
As the GPU has become one of the most promising and prevalent platforms for deploying emerging AI-enabled applications, this dissertation sets out to discuss key challenges and solutions of GPU-based heterogeneous systems and architectures, especially the memory subsystem and its management, matching the deployment requirements of emerging applications from the performance and security perspectives. Regarding the challenges, the applications typically process huge volumes of data and computation, are memory-hungry, and exhibit diverse computation properties and memory access patterns. In contrast, the GPU-based heterogeneous system, especially the GPU device, has limited memory capacity. Also, the CPU and GPU in a heterogeneous system have fundamentally different computing architectures and differentiated memory subsystems. Thus, there exists a "memory wall" caused by the mismatch between the diversified application properties and the GPU-based system heterogeneity, which damages application performance. On the other hand, applications face a variety of security and privacy risks during deployment, and the GPU-based heterogeneous system, especially its memory subsystem, can expose multiple security vulnerabilities that damage application privacy. To address these challenges, we propose a memory-and-computing coordinated methodology that thoroughly exploits the characteristics and capabilities of the GPU-based heterogeneous system to effectively optimize application performance and privacy. Specifically, 1) we propose a task-aware and dynamic memory management mechanism to co-optimize application latency and memory footprint, especially in multitasking scenarios. 2) We propose a novel latency-aware memory management framework that analyzes application characteristics and hardware features to reduce application initialization latency and response time.
3) We develop a new model extraction attack that exploits the vulnerability of the GPU unified memory system to accurately steal private DNN models. 4) We propose a CPU/GPU co-encryption mechanism that defends against a timing-correlation attack on an integrated CPU/GPU platform, providing a secure execution environment for edge applications. This dissertation aims to develop a high-performance and secure memory system and architecture on GPU heterogeneous platforms to deploy emerging AI-enabled applications efficiently and safely.