Domain Wall Memory based Design Techniques for Low Complexity Digital Signal Processor Implementation
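- Subject (keywords) DWM, DSP, CNN, BNN
- Publisher Korea University Graduate School
- Advisor 박종선, 권대한
- Year of publication 2019
- Degree conferral date 2019. 8
- Degree Doctorate
- Department Graduate School, Department of Electrical and Electronic Engineering
- Major Integrated Circuits
- Original page count 116 p
- Actual URI http://www.dcollection.net/handler/korea/000000084322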
- UCI I804:11009-000000084322
- DOI 10.23186/korea.000000084322.11009.0000941
- Text language English
- Submitted original 000045999117
Abstract/Summary
In many hardware accelerator and digital signal processor (DSP) designs, static random access memory (SRAM) based embedded memories and/or flip-flop based registers account for a significant portion of the area and power consumption. This dissertation proposes domain wall memory (DWM) based embedded memory designs for DSPs. First, DWM based embedded memories are proposed for several DSP building blocks: the survivor-path memory and last-in-first-out (LIFO) block of a Viterbi decoder, the first-in-first-out (FIFO) register files of a Fast Fourier Transform (FFT) processor and a bitonic sorter, and the input registers of a distributed arithmetic (DA) based FIR filter. To reduce area and power consumption, the serial access mechanism unique to DWM is matched to the regular and predictable memory access patterns of DSP. A classification of the proposed DWM architectures is also given. Second, for deep learning hardware accelerators, the convolutional layers of Convolutional Neural Networks (CNNs) and Binarized Neural Networks (BNNs) are explored. Based on an analysis of the data transfer between the buffer and the core, DWM-based designs are proposed to reduce that transfer.
Table of Contents
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Background
1.3 Thesis Outline
Chapter 2 Exploiting Serial Access and Asymmetric Read/Write of Domain Wall Memory for Area and Energy-Efficient Digital Signal Processor Design
2.1 Introduction
2.2 Basics of Domain Wall Memory
2.2.1 DWM Fundamentals
2.2.2 Latency, Power, and Area
2.3 DSP Memories with Sequential Access
2.3.1 Survivor Memory of Viterbi Decoder
2.3.2 LIFO Block of Viterbi Decoder
2.3.3 FIFO Memory of Pipeline FFT Processor
2.3.4 FIFO Memory of Pipeline Bitonic Sorter
2.3.5 Input Register of DA-based FIR Filter
2.4 DWM based Embedded DSP Design
2.4.1 Proposed Approach
2.4.2 Implementation Details
2.4.2.1 DWM-based survivor memory (DW-SM) for Viterbi decoder
2.4.2.2 DWM-based LIFO (DW-LIFO) for Viterbi decoder
2.4.2.3 DWM-based FIFO (DW-FIFO) for pipeline FFT processor/bitonic sorter
2.4.2.4 DWM-based DA Input Register (DW-DAIR)
2.4.2.5 Cell Characterization of DWM
2.4.2.6 DWM memory classifications
2.5 Numerical Results
2.5.1 STTRAM Characteristics
2.5.2 Survival Memory and LIFO Module in Viterbi decoder
2.5.3 FIFO Register Files in Pipeline FFT Processor
2.5.4 FIFO Register Files in Pipeline Bitonic Sorter
2.5.5 Input Register in DA-based FIR Filter
2.5.6 Considerations for the other NVM technologies
2.6 Conclusion
Chapter 3 Domain Wall Memory based Design of Deep Neural Network Convolutional Layers
3.1 Introduction
3.2 CNN and BNN Overview
3.2.1 CNN Architecture
3.2.1.1 CNN convolutional layer design
3.2.2 BNN Architecture
3.2.2.1 BNN convolutional layer design
3.2.2.2 XNOR-popcount in BNN convolutional layer
3.2.3 Comparison Results of CONV2 design in CNN and BNN
3.3 DWM based CNN Convolutional Layer Design
3.3.1 DWM-based CNN convolutional layer
3.3.2 DWM-based cell array for CNN
3.3.2.1 DWM-based cell string
3.3.2.2 Voltage reference circuit
3.3.2.3 3-bit flash type ADC
3.3.2.4 Limitation of DWM-based cell string
3.3.3 DWM input and weight bus
3.3.3.1 Design issue of DWM bus as input buffer and weight register
3.3.3.2 Operation of DWM input and weight bus
3.4 DWM based BNN Convolutional Layer Design
3.4.1 DWM based BNN convolutional layer
3.4.2 DWM based cell array for BNN
3.4.2.1 XNOR-popcount and DWM-based cell string
3.4.2.2 Parallel XNOR-popcount operation based on DWM
3.4.2.3 DWM-based weight reorder block
3.5 Numerical Results
3.5.1 CNN Implementation Results
3.5.2 BNN Implementation Results
3.6 Conclusion
Chapter 4 Conclusion
Bibliography

