Sensitivity based Error Resilient Techniques for Energy Efficient Deep Neural Network Accelerators
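- Subject (keywords) Convolutional Neural Network, Accelerator
- Publisher Korea University Graduate School (고려대학교 대학원)
- Advisor 박종선
- Publication year 2019
- Degree conferral date 2019. 8
- Type Text
- Degree Master's
- Department Graduate School, Department of Electrical and Electronic Engineering
- Specialization Integrated Circuits
- Pages 50 p
- URI http://www.dcollection.net/handler/korea/000000084559
- UCI I804:11009-000000084559
- DOI 10.23186/korea.000000084559.11009.0000933
- Language of text English
- Submitted original 000045999266
Abstract/Summary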
Given the inherent algorithmic error resilience of deep neural networks (DNNs), supply voltage scaling is a promising technique for energy-efficient DNN accelerator design. This paper presents error-resilient techniques that enable aggressive voltage scaling by exploiting the different amounts of error resilience (sensitivity) across DNN layers, filters, and channels. First, to evaluate the filter/channel-level weight sensitivities of large-scale DNNs, a first-order Taylor expansion is used, which closely approximates the weight sensitivity obtained from actual error-injection simulation. Given the measured timing error probability of each multiply-accumulate (MAC) unit under process variations, the sensitivity variation among filter weights can be leveraged in the DNN accelerator design, such that computations with more sensitive weights are assigned to more robust MAC units, while those with less sensitive weights are assigned to less robust MAC units. We also present a heterogeneous MAC unit design approach, in which larger MAC units (with shorter critical path delay) are designed to be more robust to aggressive voltage scaling, while relatively smaller MAC units are less robust to voltage scaling. To exploit the timing error probabilities of heterogeneous MAC units under voltage scaling, the sensitivity variation among filter weights is again leveraged, such that computations with more sensitive weights are assigned to more robust (larger) MAC units, while those with less sensitive weights are assigned to less robust (smaller) MAC units. Using dynamic programming, the sizes of the MAC units are selected to achieve the best DNN accuracy under an iso-area constraint.
As a result, in timing simulations using a 65 nm CMOS process, the proposed voltage-scalable DNN accelerator achieves 41% energy savings on the ImageNet dataset with ResNet-18, compared to a state-of-the-art timing error recovery technique.
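The sensitivity-to-robustness assignment described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the scoring rule (summing |gradient × weight| over each filter as a first-order Taylor estimate of the loss change), the function names, and the one-filter-per-MAC mapping are all assumptions made for illustration only.

```python
from typing import Dict, List, Sequence


def filter_sensitivities(
    weights: Sequence[Sequence[float]],
    grads: Sequence[Sequence[float]],
) -> List[float]:
    """Assumed first-order Taylor score per filter: sum of |dL/dw * w|.

    weights[f] and grads[f] hold the flattened weights of filter f and the
    loss gradients with respect to those weights (hypothetical inputs).
    """
    return [
        sum(abs(g * w) for w, g in zip(w_f, g_f))
        for w_f, g_f in zip(weights, grads)
    ]


def assign_filters_to_macs(
    sensitivities: Sequence[float],
    mac_error_probs: Sequence[float],
) -> Dict[int, int]:
    """Map filter index -> MAC unit index by rank matching.

    The most sensitive filter is paired with the MAC unit that has the
    lowest timing error probability, the next most sensitive filter with
    the next most robust unit, and so on.
    """
    if len(sensitivities) != len(mac_error_probs):
        raise ValueError("this sketch assumes one filter per MAC unit")
    filters_by_sensitivity = sorted(
        range(len(sensitivities)), key=lambda i: sensitivities[i], reverse=True
    )
    macs_by_robustness = sorted(
        range(len(mac_error_probs)), key=lambda j: mac_error_probs[j]
    )
    return dict(zip(filters_by_sensitivity, macs_by_robustness))


if __name__ == "__main__":
    # Toy values, chosen only to exercise the functions.
    weights = [[0.5, -1.0], [0.1, 0.2], [2.0, 0.0]]
    grads = [[0.2, 0.1], [0.5, -0.5], [1.0, 3.0]]
    mac_error_probs = [1e-3, 1e-6, 1e-4]  # per-MAC timing error probability
    scores = filter_sensitivities(weights, grads)
    print(assign_filters_to_macs(scores, mac_error_probs))
```

With the toy values, filter 2 has the largest score and is paired with MAC unit 1, the unit with the lowest error probability. The dynamic-programming selection of MAC sizes under the iso-area constraint is not covered by this sketch.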
Table of Contents
1. Introduction
2. Related Works and Background
2.1 Related Works
2.2 Deep Neural Networks
3. Sensitivity Analysis Using Taylor Expansion
4. PVT-Aware Hardware Design
4.1 Proposed DNN Accelerator Design
4.2 Hardware Profiling
4.3 Combining Weight Sensitivities and Hardware Profiling in DNN Accelerator Architecture
5. Heterogeneous MAC Units based Hardware Design
5.1 Modification to MAC Unit Design
5.2 Timing Error Probability Model of MAC Units
5.3 Sensitivity Map for 2D MAC Array
5.4 Problem Definition
5.5 Dynamic Programming based MAC Array Design
6. Experimental Results
6.1 Experimental Setup
6.2 PVT-Aware Hardware Design
6.3 Heterogeneous MAC Units based Hardware Design
6.4 Implementation Results
7. Conclusion
List of References

