DGIST Scholar: GMiner: A fast GPU-based frequent itemset mining method for large-scale data



Full metadata record

DC Field	Value	Language
dc.contributor.author	Chon, Kang Wook	-
dc.contributor.author	Hwang, Sang Hyun	-
dc.contributor.author	Kim, Min Soo	-
dc.date.available	2018-03-07T04:22:03Z	-
dc.date.created	2018-02-26	-
dc.date.issued	2018-05	-
dc.identifier.citation	Information Sciences, v.439-440, pp.19 - 38	-
dc.identifier.issn	0020-0255	-
dc.identifier.uri	http://hdl.handle.net/20.500.11750/5914	-
dc.description.abstract	Frequent itemset mining is widely used as a fundamental data mining technique. However, as the data size increases, the relatively slow performance of the existing methods hinders its applicability. Although many sequential frequent itemset mining methods have been proposed, there is a clear limit to the performance that can be achieved using a single thread. To overcome this limitation, various parallel methods using multi-core CPU, multiple machine, or many-core graphics processing unit (GPU) approaches have been proposed. However, these methods still have drawbacks, including relatively slow performance, data size limitations, and poor scalability due to workload skewness. In this paper, we propose a fast GPU-based frequent itemset mining method called GMiner for large-scale data. GMiner achieves very fast performance by fully exploiting the computational power of GPUs and is suitable for large-scale data. The method performs mining tasks in a counterintuitive way: it mines the patterns from the first level of the enumeration tree rather than storing and utilizing the patterns at the intermediate levels of the tree. This approach is quite effective in terms of both performance and memory use in the GPU architecture. In addition, GMiner solves the workload skewness problem from which the existing parallel methods suffer; as a result, its performance increases almost linearly as the number of GPUs increases. Through extensive experiments, we demonstrate that GMiner significantly outperforms other representative sequential and parallel methods in most cases, by orders of magnitude on the tested benchmarks. © 2018 The Authors	-
dc.language	English	-
dc.publisher	Elsevier BV	-
dc.title	GMiner: A fast GPU-based frequent itemset mining method for large-scale data	-
dc.type	Article	-
dc.identifier.doi	10.1016/j.ins.2018.01.046	-
dc.identifier.wosid	000428486600002	-
dc.identifier.scopusid	2-s2.0-85041725437	-
dc.type.local	Article(Overseas)	-
dc.type.rims	ART	-
dc.description.journalClass	1	-
dc.citation.publicationname	Information Sciences	-
dc.contributor.nonIdAuthor	Chon, Kang Wook	-
dc.contributor.nonIdAuthor	Hwang, Sang Hyun	-
dc.identifier.citationVolume	439-440	-
dc.identifier.citationStartPage	19	-
dc.identifier.citationEndPage	38	-
dc.identifier.citationTitle	Information Sciences	-
dc.type.journalArticle	Article	-
dc.description.isOpenAccess	Y	-
dc.subject.keywordAuthor	Frequent itemset mining	-
dc.subject.keywordAuthor	Graphics processing unit	-
dc.subject.keywordAuthor	Parallel algorithm	-
dc.subject.keywordAuthor	Workload skewness	-
dc.subject.keywordPlus	ALGORITHM	-
dc.contributor.affiliatedAuthor	Chon, Kang Wook	-
dc.contributor.affiliatedAuthor	Hwang, Sang Hyun	-
dc.contributor.affiliatedAuthor	Kim, Min Soo	-
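
The abstract's central idea, mining from the first level of the enumeration tree instead of storing intermediate-level patterns, can be illustrated with a small CPU-only sketch. This is not the authors' GPU implementation: the function names (`build_bitmaps`, `mine_from_first_level`), the use of Python integers as transaction bitmaps, and the Apriori-style candidate pruning are all assumptions made for illustration. The point it shows is that the support of every candidate is computed by intersecting only the bitmaps of frequent single items, so no bitmaps for intermediate levels are kept.

```python
from itertools import combinations


def build_bitmaps(transactions):
    """Map each item to an int whose bit t is set when transaction t contains it."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def popcount(mask):
    """Number of set bits, i.e. number of supporting transactions."""
    return bin(mask).count("1")


def mine_from_first_level(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets.

    Hypothetical sketch: supports are always recomputed from the
    first-level (single-item) bitmaps; only itemset names and counts
    are retained for the higher levels, never their bitmaps.
    """
    bitmaps = build_bitmaps(transactions)
    # First level of the enumeration tree: frequent 1-itemsets.
    level1 = {i: b for i, b in bitmaps.items() if popcount(b) >= min_support}
    frequent = {frozenset([i]): popcount(b) for i, b in level1.items()}

    current = list(frequent)
    k = 2
    while current:
        items = sorted({i for itemset in current for i in itemset})
        next_level = []
        for cand in combinations(items, k):
            # Apriori-style pruning (an assumption of this sketch):
            # every (k-1)-subset must already be frequent.
            if any(frozenset(s) not in frequent for s in combinations(cand, k - 1)):
                continue
            # Intersect first-level bitmaps only.
            mask = level1[cand[0]]
            for item in cand[1:]:
                mask &= level1[item]
            support = popcount(mask)
            if support >= min_support:
                frequent[frozenset(cand)] = support
                next_level.append(frozenset(cand))
        current = next_level
        k += 1
    return frequent


transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
result = mine_from_first_level(transactions, min_support=3)
```

Recomputing each intersection from single-item bitmaps costs more bitwise operations than reusing stored intermediate results, which is why the abstract calls the approach counterintuitive; the trade-off it describes is lower memory use in exchange for work that parallel hardware handles well.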

Files in This Item: There are no files associated with this item.

Appears in Collections: Department of Electrical Engineering and Computer Science > InfoLab > 1. Journal Articles


