DGIST Scholar: A Group Based Personalized Approach to Efficient Hand Gesture Recognition Using Sensor Fusion

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Eun, Yong Soon	-
dc.contributor.author	Shin, Seongjoo	-
dc.date.accessioned	2018-03-14T02:03:15Z	-
dc.date.available	2018-03-14T02:03:15Z	-
dc.date.issued	2018	-
dc.identifier.uri	http://dgist.dcollection.net/common/orgView/200000008231	en_US
dc.identifier.uri	http://hdl.handle.net/20.500.11750/6019	-
dc.description.abstract	Multimodal interfaces keep evolving to better represent people's intentions. Gestures, as one type of multimodal interface, are an effective way for people to express their intentions. In particular, hand gesture recognition provides an intuitive and convenient means of human-machine interaction (HMI). In this thesis, we investigate the problems of dynamic hand gesture recognition and develop a Korean sign language (KSL) recognition system that can help many hearing- and speech-impaired people communicate with the public. To recognize sign language, the system must first determine the shape of the hand and the movement of the arm. Because sign language consists of a continuous sequence of movements, it is difficult to isolate a particular gesture within that sequence. To address this problem, the recognition system has to know where each gesture begins and ends; we define a basic posture to obtain these starting and ending points. Sign language gestures also vary in length. For recognition, it is more effective to convert the input data (gestures) to a fixed length than to predefine the length of each gesture. Studies of hand gesture recognition commonly use various types of sensors, such as cameras, electromyography (EMG) sensors, glove sensors, and inertial measurement units (IMUs). Their usability can suffer from their weight, shapes that are uncomfortable to wear, and cumbersome calibration processes. Wearable devices such as smart watches and armbands can solve this problem. Furthermore, an effective way to improve recognition accuracy is to exploit multiple heterogeneous sensors (both an EMG sensor and an IMU sensor), which can provide redundant information about the same physical variable. Because gestures are classified from values extracted from the sensors, the data must be pre-processed before classification. 
We evaluated the performance of two normalization methods, min-max and z-score normalization. In particular, we focus on the fact that EMG signals depend on people's physical characteristics, because the amount of muscle and the thickness of the fat layer differ from person to person. Traditional recognition techniques do not consider these physical characteristics: a single model is applied to all users, so accuracy is not guaranteed. To address these issues, we create group-dependent neural network (NN) models based on sensor fusion. Our approach separates the models so that different people can use different models. After learning, people are experimentally divided into several groups according to the similarity of their data, which reflects similar body features. We verified that this physical similarity is present in the models we created. Finally, we compare our model with artificial neural network (ANN) models, including convolutional neural networks (CNNs) and long short-term memory (LSTM) networks, since these perform well in classification. The experimental results show that the proposed method achieves high accuracy (99.13% for the CNN without dropout and 98.1% for the CNN with dropout). ⓒ 2017 DGIST	-
dc.description.statementofresponsibility	open	-
dc.description.tableofcontents	I. Introduction 1-- II. Background 5-- 2.1 Hand Gesture Recognition 5-- 2.2 Sensors for Hand Gesture 6-- III. Methodology 8-- 3.1 Feature Extraction 8-- 3.2 Preprocessing and Acquisition 11-- 3.3 Creation of Architectures Using Neural Networks 13-- 3.4 Group-Dependent NN Models 14-- IV. KSL recognition system 16-- 4.1 Data Collection 16-- 4.2 Creation Group-Dependent NN Models 16-- 4.3 System Implementation 23-- V. Conclusion 24-- References 26	-
dc.format.extent	29	-
dc.language	eng	-
dc.publisher	DGIST	-
dc.subject	Korean sign language	-
dc.subject	Electromyography	-
dc.subject	Hand gesture	-
dc.subject	Sensor fusion	-
dc.subject	Artificial Neural Network	-
dc.subject	한국 수화	-
dc.subject	근전도	-
dc.subject	핸드 제스처	-
dc.subject	센서 퓨전	-
dc.subject	인공 신경망	-
dc.title	A Group Based Personalized Approach to Efficient Hand Gesture Recognition Using Sensor Fusion	-
dc.title.alternative	센서 퓨전 기반의 핸드 제스처 인식을 위한 그룹 기반의 개인 맞춤 접근	-
dc.type	Thesis	-
dc.identifier.doi	10.22677/thesis.200000008231	-
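dc.description.alternativeAbstract	As input devices advance, devices that let people express their intentions more fully are being developed. Accordingly, input devices that can be used in augmented reality or in everyday life are needed. Hand gestures are the most intuitive way for people to express their intentions, and if machines can recognize them, communication between computers and people will improve. Hand gestures are divided into dynamic gestures, which involve movement, and static gestures, which do not. For static gestures, the muscles used differ with the shape of the hand, so the hand shape can be inferred with an electromyography (EMG) sensor that measures this muscle activity. Dynamic gestures can be measured with an IMU. We fuse these two sensors to recognize dynamic hand gestures. Because dynamic gestures differ in the length of their movement, their starting and ending points are hard to find. We therefore define a basic posture so that the starting and ending points can be distinguished. Starting and ending points are detected relative to the basic posture using the IMU's accelerometer, and the average of the 10 most recent samples is used to reduce the effect of noise. When this average leaves a certain range, that moment is taken as the starting point; when it falls inside the range 10 times in a row, that moment is taken as the ending point. Consecutive readings are required because part of a gesture may be performed near the range. For dynamic gestures, samples of the same gesture differ in length, so zero padding is applied to make the samples the same length. Because the EMG and IMU sensors were used together, the EMG sensor, which has the higher sampling rate, was downsampled. In addition, because the sensors' output ranges differ, we normalize the data, applying two kinds of normalization (z-score and min-max) and comparing their performance. We built models with CNN and LSTM architectures; for the CNN, we built one model with dropout and one without. To use the CNN models, the preprocessed sensor values were represented like an image. People have different physical characteristics, so the values measured by the sensors may differ from person to person. This can degrade the performance of a classification model. We therefore aim to raise the recognition rate by grouping similar people and using a different model for each group. We confirmed that performance improves greatly when a user uses the best-suited model among models trained on data from groups with similar characteristics, rather than a single model trained on all the data. People with disabilities in Korea who use sign language have difficulty communicating with people who do not know sign language. We therefore build a system that recognizes sign language and translates it in real time. The system recognizes sign language and announces its meaning through a speaker, so that people who do not know sign language can still understand it.	-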
dc.description.degree	Master	-
dc.contributor.department	Information and Communication Engineering	-
dc.contributor.coadvisor	Son, Sang Hyuk	-
dc.date.awarded	2018. 2	-
dc.publisher.location	Daegu	-
dc.description.database	dCollection	-
dc.date.accepted	2018-01-05	-
dc.contributor.alternativeDepartment	대학원 정보통신융합전공	-
dc.contributor.alternativeName	신성주	-
dc.contributor.alternativeName	은용순	-
dc.contributor.alternativeName	손상혁	-
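
The preprocessing steps described in the abstracts above (start and end detection from a 10-sample moving average of accelerometer data, zero padding to a fixed length, and min-max or z-score normalization) can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the function names, the `rest_value` and `tolerance` parameters that stand in for the "certain range" around the basic posture, and the use of a single scalar accelerometer channel are all assumptions made for the example.

```python
from collections import deque


def segment_gesture(samples, rest_value, tolerance, window=10, settle_count=10):
    """Return (start, end) indices of the first gesture, or None.

    samples: scalar accelerometer readings (assumed: one axis or a magnitude).
    rest_value, tolerance: hypothetical description of the basic posture;
    the record only speaks of "a certain range".
    """
    recent = deque(maxlen=window)  # the 10 most recent samples
    start = None
    inside_run = 0
    for i, value in enumerate(samples):
        recent.append(value)
        mean = sum(recent) / len(recent)
        inside = abs(mean - rest_value) <= tolerance
        if start is None:
            if not inside:
                start = i  # moving average left the range: gesture starts
        else:
            # The end requires `settle_count` consecutive in-range averages.
            inside_run = inside_run + 1 if inside else 0
            if inside_run >= settle_count:
                return start, i
    return None


def zero_pad(sequence, length):
    """Truncate or right-pad a sequence with zeros to a fixed length."""
    return list(sequence[:length]) + [0.0] * max(0, length - len(sequence))


def min_max(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Shift and scale values to zero mean and unit (population) variance."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]
```

Requiring several consecutive in-range averages before declaring the end of a gesture matches the rationale given in the alternative abstract: part of a gesture may pass close to the basic posture, and a single in-range reading should not cut the gesture short.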

Files in This Item: 정보통신_신성주.pdf
Other data / 951.75 kB / Adobe PDF

Appears in Collections: Department of Electrical Engineering and Computer Science Theses Master
