DGIST Scholar: Low-resolution Image Recognition using Privileged Information

Department of Electrical Engineering and Computer Science Theses Master

Title: Low-resolution Image Recognition using Privileged Information

Alternative Title: 우월한 정보를 이용한 저해상도 영상 인식

Author(s): Jaeyong Park

DGIST Authors: Kwak, Suha ; Park, Jaeyong ; Cho, Sunghyun

Advisor: 조성현

Co-Advisor(s): Suha Kwak

Issued Date: 2019

Awarded Date: 2019-02

Type: Thesis

Abstract: Recognizing small images is a very challenging problem. Unlike high-resolution (HR) images, low-resolution (LR) images contain relatively little of the detail that helps identify the objects in them. Most existing convolutional neural network based approaches add a super-resolution (SR) module that increases the resolution of the LR image and restores detail available in the HR image. However, state-of-the-art SR models are very deep and heavy. Combining an SR model with a recognition model therefore requires a large amount of memory and a long training time before it can be used at test time.
Herein I present an effective learning procedure for training a low-resolution (LR) image recognition network. I adopt a teacher-student training scheme so that the network performs better by learning from both the ground truth and extra supervision from a teacher network, the HR image recognition network. I treat the teacher network's predictions as privileged information, which is not available at test time. Like a human teacher and student, the teacher network helps train the student network by transferring its knowledge. Using the knowledge transferred from the teacher network, the trained student network performs better than an LR image recognition network trained on its own.

Abstract (translated from the Korean): Low-resolution image recognition is a very difficult problem. Unlike high-resolution images, small images carry relatively little detail, which makes them hard to classify. Most existing low-resolution image recognition work based on convolutional neural networks applies super-resolution to raise the resolution and restore detail. However, recent high-performing super-resolution models tend to be deep and computationally expensive. Using such a super-resolution model together with an image recognition model therefore requires more memory and considerable care in training, so it is not a good approach.
We therefore propose a more effective method for training a low-resolution image recognition network. A high-resolution image recognition model, acting as the teacher, is used to train a low-resolution image recognition model, acting as the student, which provides additional training information. Just as a person learns both from a teacher's guidance and from self-study with an answer key, our method carries out both kinds of learning simultaneously in a simple implementation, and it surpasses the baseline performance we set. The high-resolution image recognition model is used only in the training stage, when high-resolution images are available, which overcomes several problems that arise when an additional super-resolution model is used.
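The teacher-student scheme described in the abstract can be illustrated with a small sketch. This record does not give the thesis's exact loss, so the code below assumes a common knowledge-distillation formulation: a cross-entropy term on the ground-truth label plus a temperature-softened KL term that pulls the student's prediction toward the teacher's. The function names, the temperature value, and the weight `alpha` are hypothetical choices made for illustration, not values taken from the thesis.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Hard-label loss: negative log-probability of the true class."""
    return -math.log(softmax(logits)[label])


def distillation_term(student_logits, teacher_logits, temperature):
    """KL(teacher || student) on temperature-softened predictions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )


def student_loss(student_logits, teacher_logits, label,
                 temperature=4.0, alpha=0.5):
    """Assumed combined loss for the LR student network.

    student_logits: student output for the LR image.
    teacher_logits: teacher output for the matching HR image
                    (privileged information, training time only).
    temperature, alpha: hypothetical hyperparameters.
    """
    hard = cross_entropy(student_logits, label)
    soft = distillation_term(student_logits, teacher_logits, temperature)
    # The T^2 factor is the usual rescaling of the softened term.
    return (1.0 - alpha) * hard + alpha * temperature ** 2 * soft
```

At test time only the student network is evaluated on the LR input; the teacher logits, and therefore `student_loss`, are needed during training only, which is what makes the teacher's prediction privileged information.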

Table Of Contents: I. INTRODUCTION 10

II. RELATED WORK 13

2.1 CONVOLUTIONAL NEURAL NETWORKS 13

2.2 LOW-RESOLUTION IMAGE SUPER RESOLUTION FOR IMAGE RECOGNITION 13

2.3 STAGED LEARNING 16

2.4 LEARNING USING PRIVILEGED INFORMATION 17

2.5 TRANSFER LEARNING 18

III. PROPOSED PROCEDURE 21

3.1 OVERALL 21

3.2 LEARNING USING PRIVILEGED INFORMATION 21

3.3 KNOWLEDGE DISTILLATION LOSS 21

3.4 IMPLEMENTATION DETAILS 22

3.4.1 ARCHITECTURE OF THE NETWORK 22

3.4.2 TRAINING SETTING 24

IV. EXPERIMENTAL RESULTS 26

4.1 DATASETS 26

4.2 CALTECH-UCSD BIRDS 200-2011 DATASET 27

4.3 PASCAL VOC 2012 CLASSIFICATION DATASET 28

V. FAILURE CASE 30

VI. DISCUSSION 32

VII. CONCLUSION 33

VIII. REFERENCES 34

ABSTRACT (IN KOREAN) 36