Feature Pyramid Pooling and Multi-Scale Context Aggregation for Robust Object Detection
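- Subject (keywords): Object detection
- Publisher: Korea University Graduate School (고려대학교 대학원)
- Advisor: 고성제 (Sung-Jea Ko)
- Publication year: 2019
- Degree conferral date: 2019. 2
- Degree type: Doctoral
- Department: Graduate School, Department of Electrical, Electronic and Radio Engineering (전기전자전파공학과)
- Full-text pages: 85 p
- Actual URI: http://www.dcollection.net/handler/korea/000000083142
- UCI: I804:11009-000000083142
- DOI: 10.23186/korea.000000083142.11009.0000930
- Language of text: English
- Submitted original: 000045978675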
Abstract/Summary
Conventional convolutional neural network (CNN)-based feature pyramid (FP) networks gradually increase the number of feature layers in a pyramidal shape instead of using a featurized image pyramid. However, the semantic gap between the CNN layers may limit object detection performance, especially on small objects. In this dissertation, I propose a novel CNN-based wide FP network architecture that uses multi-scale context aggregation (MSCA) for robust object detection, referred to as the parallel FP network (PFPNet), in which the FP is constructed by widening the network instead of deepening it. First, I employ spatial pyramid pooling (SPP) and additional feature transformations to construct a pool of feature maps with different spatial sizes. In PFPNet, the additional feature transformations are performed in parallel, which yields feature maps with similar levels of semantic information across scales. Then, I resize the elements of the feature pool to a uniform size and aggregate their contextual information to generate each level of the final FP. Extensive experimental results verify that PFPNet achieves state-of-the-art performance on publicly available object detection datasets.
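The pooling and aggregation steps described in the abstract can be illustrated with a minimal single-channel sketch. It is not the dissertation's implementation: the function names (`max_pool`, `resize_nearest`, `build_pyramid`) are hypothetical, and the use of max pooling, nearest-neighbour resizing, and channel stacking as the aggregation step are assumptions made for illustration. The learned parallel feature transformations are omitted entirely.

```python
def max_pool(fmap, out_size):
    """Pool a square 2D map (list of lists) into out_size x out_size bins.

    Assumption: max pooling; the abstract only says "spatial pyramid pooling".
    """
    n = len(fmap)
    assert n % out_size == 0, "sketch assumes the size divides evenly"
    k = n // out_size  # side length of each pooling bin
    return [
        [
            max(fmap[r][c]
                for r in range(i * k, (i + 1) * k)
                for c in range(j * k, (j + 1) * k))
            for j in range(out_size)
        ]
        for i in range(out_size)
    ]


def resize_nearest(fmap, out_size):
    """Nearest-neighbour resize of a square 2D map to out_size x out_size."""
    n = len(fmap)
    return [
        [fmap[i * n // out_size][j * n // out_size] for j in range(out_size)]
        for i in range(out_size)
    ]


def build_pyramid(fmap, sizes):
    """Return one stack of resized pool elements per pyramid level."""
    # 1) Feature pool: the same input pooled to several spatial sizes.
    pool = [max_pool(fmap, s) for s in sizes]
    # 2) For each level, resize every pool element to that level's size and
    #    stack the results as channels (assumed form of aggregation).
    return [[resize_nearest(p, s) for p in pool] for s in sizes]
```

Calling `build_pyramid(fmap, [4, 2, 1])` on a 4x4 map returns three levels, each holding three channels at that level's resolution, so every level sees context from all scales in the pool.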
Table of Contents
1. Introduction
2. Related Works
2.1 Region-Based OD Methods
2.2 Region-Free OD Methods
2.3 Feature Maps Based on SPP
3. Proposed Method
3.1 Motivation
3.2 Feature Maps by Different Type of FPs
3.3 Base Network
3.4 Bottleneck Feature Layer
3.5 FP Pool
3.6 MSCA
3.7 Prediction Subnet
3.8 Anchors and Details of PFPNet
3.9 Optimization
4. Experiments
4.1 Datasets
4.2 Compared OD Methods
4.3 PASCAL VOC 2007
4.3.1 Experimental Setup
4.3.2 Ablation Study 1: Hyper-Parameters
4.3.3 Ablation Study 2: Wide FP and MSCA Module
4.3.4 Performance Comparisons with Other FPs
4.3.5 Results
4.4 PASCAL VOC 2012
4.5 MS COCO
4.6 Speed vs Accuracy Trade-off
5. Conclusions
References
Bibliography
Abbreviations
Abstract

