Full metadata record

DC Field Value Language
dc.contributor.author Chon, Kang-Wook -
dc.contributor.author Kim, Min-Soo -
dc.date.accessioned 2018-03-07T04:21:56Z -
dc.date.available 2018-03-07T04:21:56Z -
dc.date.created 2018-02-26 -
dc.date.issued 2018-09 -
dc.identifier.citation Cluster Computing, v.21, no.3, pp.1507 - 1520 -
dc.identifier.issn 1386-7857 -
dc.identifier.uri http://hdl.handle.net/20.500.11750/5910 -
dc.description.abstract Frequent itemset mining is widely used as a fundamental data mining technique. Recently, a number of MapReduce-based frequent itemset mining methods have been proposed to overcome the limits on data size and mining speed of sequential mining methods. However, the existing MapReduce-based methods still do not scale well because of high workload skewness, large intermediate data, and large network communication overhead. In this paper, we propose BIGMiner, a fast and scalable MapReduce-based frequent itemset mining method. BIGMiner generates equal-sized sub-databases called transaction chunks and performs support counting using only transaction chunks and bitwise operations, without generating and shuffling intermediate data. As a result, BIGMiner achieves very high scalability, with no workload skewness, no intermediate data, and small network communication overhead. Through extensive experiments using large-scale datasets of up to 6.5 billion transactions, we show that BIGMiner consistently and significantly outperforms the state-of-the-art methods without any memory problems. © 2018 Springer Science+Business Media, LLC, part of Springer Nature -
dc.language English -
dc.publisher Springer New York LLC -
dc.title BIGMiner: a fast and scalable distributed frequent pattern miner for big data -
dc.type Article -
dc.identifier.doi 10.1007/s10586-018-1812-0 -
dc.identifier.wosid 000457275200004 -
dc.identifier.scopusid 2-s2.0-85041818619 -
dc.type.local Article(Overseas) -
dc.type.rims ART -
dc.description.journalClass 1 -
dc.citation.publicationname Cluster Computing -
dc.contributor.nonIdAuthor Chon, Kang-Wook -
dc.identifier.citationVolume 21 -
dc.identifier.citationNumber 3 -
dc.identifier.citationStartPage 1507 -
dc.identifier.citationEndPage 1520 -
dc.identifier.citationTitle Cluster Computing -
dc.type.journalArticle Article -
dc.description.isOpenAccess N -
dc.subject.keywordAuthor Big data -
dc.subject.keywordAuthor Distributed algorithm -
dc.subject.keywordAuthor Frequent pattern mining -
dc.subject.keywordAuthor MapReduce -
dc.subject.keywordAuthor Scalable algorithm -
dc.contributor.affiliatedAuthor Chon, Kang-Wook -
dc.contributor.affiliatedAuthor Kim, Min-Soo -
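
The abstract describes support counting over equal-sized transaction chunks using bitwise operations. The following is a minimal single-machine sketch of that general idea, not the authors' implementation: the function names (make_chunks, build_bitmaps, chunk_support, support) and the bitmap layout are assumptions made for illustration, and the MapReduce distribution of chunks is left out entirely.

```python
from typing import Dict, FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def make_chunks(transactions: List[Transaction], chunk_size: int) -> List[List[Transaction]]:
    """Split the database into chunks of chunk_size transactions.

    Only the last chunk may be smaller (hypothetical helper, not from the paper).
    """
    return [transactions[i:i + chunk_size]
            for i in range(0, len(transactions), chunk_size)]


def build_bitmaps(chunk: List[Transaction]) -> Dict[str, int]:
    """Map each item to an integer bitmap; bit p is set if transaction p contains the item."""
    bitmaps: Dict[str, int] = {}
    for pos, transaction in enumerate(chunk):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << pos)
    return bitmaps


def chunk_support(bitmaps: Dict[str, int], itemset: Iterable[str], chunk_len: int) -> int:
    """Count the transactions in one chunk that contain every item of itemset."""
    mask = (1 << chunk_len) - 1  # start with all transactions selected
    for item in itemset:
        mask &= bitmaps.get(item, 0)  # bitwise AND keeps only matching transactions
        if mask == 0:
            return 0
    return bin(mask).count("1")  # population count of the surviving bits


def support(transactions: List[Transaction], itemset: Iterable[str], chunk_size: int) -> int:
    """Sum per-chunk supports; each chunk is processed independently of the others."""
    items = list(itemset)
    return sum(chunk_support(build_bitmaps(chunk), items, len(chunk))
               for chunk in make_chunks(transactions, chunk_size))


if __name__ == "__main__":
    db = [frozenset(t) for t in (["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"], ["a", "b"])]
    print(support(db, ["a", "b"], chunk_size=2))
```

Because each chunk's count depends only on that chunk's bitmaps, the per-chunk work in this sketch could be handed to separate workers and the partial counts summed afterwards, which is one plausible reading of why the abstract reports no intermediate data to shuffle.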
Files in This Item:

There are no files associated with this item.

Appears in Collections:
Department of Electrical Engineering and Computer Science InfoLab 1. Journal Articles

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.