DGIST Scholar: Bandit Parameter Estimation

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Kwak, Su Ha	-
dc.contributor.author	Cha, Sungmin	-
dc.date.accessioned	2018-03-14T02:03:19Z	-
dc.date.available	2018-03-14T02:03:19Z	-
dc.date.issued	2018	-
dc.identifier.uri	http://dgist.dcollection.net/common/orgView/200000007744	en_US
dc.identifier.uri	http://hdl.handle.net/20.500.11750/6021	-
dc.description.abstract	The contextual bandit is a useful framework for recommendation tasks in many applications, such as Netflix and Amazon Echo. Many contextual bandit algorithms have been studied and have shown good results in terms of high total reward or low regret. However, when a user wants a recommendation in a new task, these algorithms do not use the information learned from previous tasks. We propose a new problem, Bandit Parameter Estimation, to address this inefficiency. In the same setting as the contextual bandit, we treat the unknown bandit parameter as the user's latent profile and propose several algorithms that estimate it as quickly as possible. We ran experiments on a synthetic dataset in two cases to verify the proposed algorithms. The results show that our algorithm estimates the parameter faster than other contextual bandit algorithms. ⓒ 2017 DGIST	-
dc.description.statementofresponsibility	open	-
dc.description.tableofcontents	Ⅰ. Introduction 1-- 1.1 Overview 1-- 1.2 Background 2-- 1.2.1 Multi-Armed bandit 2-- 1.2.2 K-armed (Linear) Contextual bandit 3-- 1.3 Related work 4-- 1.3.1 algorithm 4-- 1.3.2 UCB 5-- 1.3.3 LinUCB 6-- Ⅱ. Materials 8-- 2.1 Problem setting for Bandit Parameter Estimation 8-- 2.2 The uncertainty ellipsoid of Θ_* 9-- 2.2.1 Max(minEig.val) 10-- 2.2.2 Max(Tr(Σ_t)) 11-- 2.2.3 Min(Tr(Σ_t^(-1))) 11-- 2.2.4 Max(Det(Σ_t)) 12-- Ⅲ. Method 13-- 3.1 Generating synthetic data 13-- 3.2 The experiment process 13-- Ⅳ. Experimental result 14-- 4.1 The experiment case 1 : Various k, fixed d 14-- 4.2 The experiment case 2 : Various d, fixed k 15-- Ⅴ. Discussion 17-- 5.1 Conclusion 17-- 5.2 Future work 17-- Reference 18-- Summary (Korean) 19	-
dc.format.extent	19	-
dc.language	eng	-
dc.publisher	DGIST	-
dc.subject	Recommendation	-
dc.subject	Bandit	-
dc.subject	Contextual Bandit	-
dc.subject	Parameter Estimation	-
dc.title	Bandit Parameter Estimation	-
dc.type	Thesis	-
dc.identifier.doi	10.22677/thesis.200000007744	-
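dc.description.alternativeAbstract	Many applications now provide personalized recommendations. The algorithms mainly used for this take the form of the contextual bandit, which has already been studied extensively and shows good results. However, although these algorithms quickly provide suitable recommendations to a given user within a single task, when recommendations must be provided for a new task they cannot use the information learned in the previous task and must learn again independently for each task, which is inefficient. Motivated by this, we propose a new problem, Bandit Parameter Estimation, for learning the user's recent profile in the same setting as the contextual bandit. For fast learning, we propose several algorithms that shrink the uncertainty ellipsoid, and we confirm on a dataset created for the experiments that the proposed algorithms perform parameter estimation faster than existing contextual bandit algorithms. As future work, we suggest verifying the algorithms by applying them to real data, and studying whether contextual bandit algorithms that additionally use the learned user profile provide good recommendations faster than when the profile is not used.	-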
dc.description.degree	Master	-
dc.contributor.department	Information and Communication Engineering	-
dc.contributor.coadvisor	Moon, Tae Sup	-
dc.date.awarded	2018. 2	-
dc.publisher.location	Daegu	-
dc.description.database	dCollection	-
dc.date.accepted	2018-01-05	-
dc.contributor.alternativeDepartment	대학원 정보통신융합전공	-
dc.contributor.alternativeName	차성민	-
dc.contributor.alternativeName	곽수하	-
dc.contributor.alternativeName	문태섭	-

Files in This Item: 정보통신_차성민.pdf
Other data / 1.79 MB / Adobe PDF

Appears in Collections: Department of Electrical Engineering and Computer Science Theses Master

DGIST Scholar was built with support from the OAK distribution project by the National Library of Korea.

Library Services Team, DGIST 333. Techno Jungang-daero, Hyeonpung-myeon, Dalseong-gun, Daegu, 42988, Republic of Korea.