DGIST Scholar: An Effective and Efficient Method for Tweaking Deep Neural Networks

Title: An Effective and Efficient Method for Tweaking Deep Neural Networks

Alternative Title: 심층 신경망을 조정하기 위한 효과적이고 효율적인 방법

Author(s): Jinwook Kim

DGIST Authors: Kim, Jinwook ; Kim, Daehoon ; Kim, Min-Soo

Advisor: Daehoon Kim

Co-Advisor(s): Min-Soo Kim

Issued Date: 2021

Awarded Date: 2021/08

Type: Thesis

Subject: Deep neural networks, synaptic join, network augmentation

Description: Deep neural networks, synaptic join, network augmentation

Abstract: Today, deep learning models are trained on the given training data so as to achieve high overall (average) accuracy across all classes on the validation data.
That is, even if the accuracy of specific classes is very poor, the optimization proceeds in the direction of improving the overall accuracy by predicting correct answers for as many samples as possible regardless of class.
Even if additional training is performed to improve a specific class having poor accuracy, the accuracy of each class fluctuates greatly as the learning progresses.
That is, no matter when training is stopped, some classes end up significantly less accurate than the average.
This is an inevitable problem with current deep learning training methods.
In particular, in applications where the accuracy of a specific class is important, such as medical artificial intelligence systems, this problem can be fatal.
To solve this problem, we need a method to improve the accuracy of the specific target class, which is important in the given application, while maintaining the overall accuracy.
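
The gap between overall and per-class accuracy described above can be illustrated with a small sketch. The toy labels and predictions are hypothetical and are not taken from the thesis; they only show how a high overall accuracy can hide one poorly recognized class.

```python
from collections import defaultdict


def overall_accuracy(labels, preds):
    """Fraction of all samples predicted correctly, regardless of class."""
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    return correct / len(labels)


def per_class_accuracy(labels, preds):
    """Accuracy computed separately for each true class."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for y, p in zip(labels, preds):
        total[y] += 1
        if y == p:
            correct[y] += 1
    return {c: correct[c] / total[c] for c in total}


# Hypothetical validation set: three classes with ten samples each.
labels = [0] * 10 + [1] * 10 + [2] * 10
# Classes 0 and 1 are always right; class 2 is right only 4 times out of 10.
preds = [0] * 10 + [1] * 10 + [2] * 4 + [0] * 6

overall = overall_accuracy(labels, preds)
by_class = per_class_accuracy(labels, preds)
print(overall, by_class)
```

Here the overall accuracy is 24/30 while class 2 is recognized only 4 times out of 10, which is the kind of class a user might choose as the target class.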

In the first part of this dissertation, we propose the Synaptic Join method, which precisely adjusts the accuracy of a specific class (the target class) that the user wants to improve, rather than performing additional training on an already trained deep learning model.
The proposed synaptic join method finds active neurons that can improve the accuracy of the target class while minimizing the damage to the accuracy of non-target classes.
The active neurons are connected in the form of synapses to the output neurons to improve the target class.
The proposed method can tweak the model according to the user's request by adding and removing synapses only, while keeping the original model as it is.
We also introduce a technique that can quickly process synaptic join operations within the limited memory of multiple GPUs.
Experimental comparisons with retraining methods show that our method can better control and effectively improve the accuracy of target classes.
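
The idea of adding synapses from active neurons to an output neuron can be sketched roughly as follows. This is not the algorithm from the thesis: the selection score (mean activation on target samples minus mean activation on non-target samples), the fixed synapse weight, and all function names and toy values are assumptions made only for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs) if xs else 0.0


def select_active_neurons(activations, labels, target, k):
    """Assumed criterion: keep up to k neurons that fire more on target
    samples than on non-target samples (largest positive gap first)."""
    scores = []
    for j in range(len(activations[0])):
        on_target = [a[j] for a, y in zip(activations, labels) if y == target]
        off_target = [a[j] for a, y in zip(activations, labels) if y != target]
        scores.append((mean(on_target) - mean(off_target), j))
    scores.sort(reverse=True)
    return [j for gap, j in scores[:k] if gap > 0]


def apply_synapses(logits, activation, synapses):
    """Add each synapse's contribution to a copy of the logits.
    synapses: list of (hidden_neuron_index, output_index, weight)."""
    out = list(logits)
    for j, c, w in synapses:
        out[c] += w * activation[j]
    return out


def predict(logits):
    return max(range(len(logits)), key=lambda c: logits[c])


# Hypothetical hidden activations and output logits of a trained model.
activations = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.1], [0.2, 0.1, 0.9], [0.3, 0.0, 0.8]]
labels = [0, 0, 1, 1]
logits = [[2.0, 0.5], [1.8, 0.4], [1.0, 0.9], [0.7, 1.2]]

target = 1
active = select_active_neurons(activations, labels, target, k=1)
# Assumed fixed weight of 0.5 for every added synapse.
synapses = [(j, target, 0.5) for j in active]

before = [predict(l) for l in logits]
after = [predict(apply_synapses(l, a, synapses)) for l, a in zip(logits, activations)]
print(active, before, after)
```

In this toy setting the third sample, which the original logits misclassify, is corrected, while the original logits stay untouched and the added synapses can be removed again simply by dropping them from the list.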

In the second part of this dissertation, we propose Network Augmentation with Active Neurons.
The network augmentation method is one of the widely used techniques to extend a neural network model or perform transfer learning.
The method adds a new hidden layer to the model that has already been trained and performs fine-tuning to obtain more accurate models or models that fit the purpose of given applications.
However, adding layers to a large-scale deep neural network model can increase the learning time and the number of parameters required for training the model.
For efficient training, we propose an augmented network that takes only a small number of active neurons as its input.
In addition, unlike general network augmentation, the proposed method trains only the augmented part while keeping the original model unchanged.
We show that, compared to the depth-augmented method, our method achieves similar or better accuracy with fewer parameters and shorter training time in all experiments.
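
A minimal sketch of this idea, under stated assumptions, is given below. The layer sizes, the logistic-regression head, the learning rate, and the toy activations are all hypothetical and are not the architecture or settings used in the thesis; the sketch only shows why feeding a few active neurons into a small trainable head reduces the number of parameters, and that the original model's activations are read but never updated.

```python
import math


def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def gather(activation, active_idx):
    """Keep only the outputs of the selected (active) neurons."""
    return [activation[j] for j in active_idx]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(features, targets, lr=0.5, epochs=200):
    """Train a tiny logistic-regression head with per-sample gradient steps.
    Only w and b change; the frozen model that produced the features does not."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(features, targets):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - t  # gradient of the log loss with respect to the pre-activation
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


# Hypothetical sizes: a new layer over all 4096 hidden units vs. 64 active ones.
full_layer = dense_params(4096, 1000)
active_layer = dense_params(64, 1000)

# Hypothetical frozen activations; neuron 2 is assumed to be the active neuron.
activations = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.1], [0.2, 0.1, 0.9], [0.3, 0.0, 0.8]]
targets = [0, 0, 1, 1]
features = [gather(a, [2]) for a in activations]

w, b = train_head(features, targets)
head_preds = [1 if sigmoid(w[0] * x[0] + b) > 0.5 else 0 for x in features]
print(full_layer, active_layer, head_preds)
```

With these assumed sizes, the head over active neurons has 65,000 parameters compared with 4,097,000 for a layer over all hidden units.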

In summary, this dissertation proposes methods for improving a model to suit the user's purpose without changing the original deep neural network.
To improve the accuracy of the target class in the original model while minimizing the effect on the accuracy of non-target classes, we propose the Synaptic Join method.
To efficiently augment deep neural networks, we propose a network augmentation method using active neurons.
The proposed methods adjust neural networks effectively and efficiently according to the user's purpose, and are expected to be useful in customized artificial intelligence service applications.

Table Of Contents: Chapter 1 Introduction 13
1.1 Introduction 13
1.2 Main contributions 17
1.3 Structure of thesis 18
Chapter 2 Background 19
2.1 Deep Neural Network Models 19
2.1.1 Deep Neural Network 19
2.1.2 Convolutional Neural Networks 20
2.2 Network Augmentation 23
Chapter 3 Tweaking Deep Neural Networks 29
3.1 Simple Retraining Method 29
3.2 Synaptic Join Method 31
3.2.1 Tables for Join 31
3.2.2 Algorithm θ 32
3.2.3 Synaptic Join Method 37
3.2.4 Synaptic Retraining Method 40
3.3 Experimental evaluation 42
3.3.1 Environments 42
3.3.2 Comparison with Retraining and Relevant Methods 43
3.3.3 Quantitative Analysis 48
3.3.4 Evaluation of Different Models for the Same Data 49
3.3.5 Distribution of Synapses 50
3.3.6 Characteristics of Synaptic Join 51
3.3.7 Characteristics of Synaptic Retraining 53
3.3.8 Synaptic Join for Imbalanced Data 54
3.3.9 Time and Space Cost of Synaptic Join 56
Chapter 4 Augmenting Deep Neural Networks with Active Neurons 63
4.1 Augmentation with Active Neurons 63
4.2 Experimental Evaluation 64
4.2.1 Environments 64
4.2.2 Comparison with DA and WA Method 66
Chapter 5 Related Work 71
Chapter 6 Conclusions 74
References 76
Appendix 89
Summary (in Korean) 91

URI: http://dgist.dcollection.net/common/orgView/200000497154

http://hdl.handle.net/20.500.11750/16602

DOI: 10.22677/thesis.200000497154

Degree: Doctor

Department: Information and Communication Engineering

Publisher: DGIST

Related Researcher

Kim, Daehoon
Research Interests Computer Architecture and Systems; Virtualization; Cloud Computing

Files in This Item: 200000497154.pdf
Other data / 4.02 MB / Adobe PDF

Appears in Collections: Department of Electrical Engineering and Computer Science Theses Ph.D.
