Seminar: Graduate Seminar

ECE Women Community

Enhancing DNN Computational Efficiency through Time-Domain Adaptive Scheduling Leveraging Approximation and Decomposition

Date: August 6, 2024 Time: 13:00–14:00
Location: 1061, Meyer Building
Lecturer: Ori Schweitzer
The increasing computational demands of emerging deep neural networks (DNNs), driven by their compute intensity across a wide range of tasks, place significant strain on hardware resources. This work introduces an adaptive microarchitecture that improves the area, power, and energy efficiency of DNN accelerators through approximate computation and decomposition, while preserving accuracy. Our solution improves DNN efficiency via adaptive resource allocation and simultaneous multi-threading (SMT). It exploits two prominent attributes of DNNs: resiliency, the ability to maintain accuracy with only small degradation even when computations are inexact, and sparsity, at both the magnitude and bit levels.

Our microarchitecture decomposes the Multiply-and-Accumulate (MAC) unit into fine-grained elementary computational resources. It further employs an approximate representation that enables dynamic, flexible allocation of the decomposed resources through SMT, thereby improving resource utilization and reducing power consumption. We further improve efficiency with a new Temporal SMT (tSMT) technique, which processes computations from temporally adjacent threads by expanding the computational time window for resource allocation.

Our simulation analysis, using a systolic array accelerator as a case study, indicates that the proposed microarchitecture achieves more than a 30% reduction in area and power, with less than 1% accuracy degradation on state-of-the-art DNNs for vision and natural language processing (NLP) tasks, compared to conventional processing elements (PEs) with 8-bit MAC units.
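To give a flavor of the idea, the following is an illustrative sketch (not the talk's actual microarchitecture): it approximates a multiply-accumulate by keeping only the most significant set bits of each operand, the kind of bit-level sparsity that allows decomposed MAC resources to be shared across computations. The function names and the `keep` parameter are assumptions made for illustration.

```python
def top_bits(x: int, keep: int) -> int:
    """Keep the `keep` most significant set bits of a non-negative int,
    zeroing the rest (a simple bit-level approximation)."""
    result, kept = 0, 0
    for bit in range(x.bit_length() - 1, -1, -1):
        if x & (1 << bit):
            result |= 1 << bit
            kept += 1
            if kept == keep:
                break
    return result

def approx_mac(acc: int, a: int, b: int, keep: int = 3) -> int:
    """Approximate multiply-accumulate on truncated operands."""
    return acc + top_bits(a, keep) * top_bits(b, keep)

exact = 178 * 93                      # 16554
approx = approx_mac(0, 178, 93)       # 176 * 88 = 15488, ~6% error
```

Because most of the product's magnitude is carried by the leading bits, the truncated computation stays close to the exact result while requiring far fewer elementary resources, which is what makes dynamic allocation across threads attractive.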

Lastly, we explore temporal resilience in diffusion models. Temporal resilience refers to changes in a model's ability to withstand computational inaccuracies over time, i.e., across iterations of the sampling process. This means that varying degrees of approximation can be applied at different iterations, with some iterations tolerating larger errors than others while still producing high-quality images. We investigate pipelining as a means of exploiting this temporal resilience to enhance computational efficiency.
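A minimal sketch of the scheduling idea, under assumptions of our own: assign a coarser precision level to the sampling iterations presumed to be more error-tolerant. The fixed `tolerant_fraction` cutoff is hypothetical; which steps actually tolerate larger errors is model-specific and is precisely what the temporal-resilience analysis characterizes.

```python
def approximation_schedule(num_steps: int, tolerant_fraction: float = 0.5):
    """Return a per-iteration precision label: 'low' for the assumed
    error-tolerant initial steps, 'full' for the remaining steps."""
    cutoff = int(num_steps * tolerant_fraction)
    return ["low" if t < cutoff else "full" for t in range(num_steps)]

schedule = approximation_schedule(8)
# ['low', 'low', 'low', 'low', 'full', 'full', 'full', 'full']
```

Such a per-step schedule also suggests why pipelining is a natural fit: iterations running at reduced precision free up decomposed resources that concurrent, full-precision work can occupy.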

M.Sc. student under the supervision of Prof. Uri Weiser and Freddy Gabbay.

 
