Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chon, Kang-Wook | - |
dc.contributor.author | Kim, Min-Soo | - |
dc.date.accessioned | 2018-03-07T04:21:56Z | - |
dc.date.available | 2018-03-07T04:21:56Z | - |
dc.date.created | 2018-02-26 | - |
dc.date.issued | 2018-09 | - |
dc.identifier.citation | Cluster Computing, v.21, no.3, pp.1507 - 1520 | - |
dc.identifier.issn | 1386-7857 | - |
dc.identifier.uri | http://hdl.handle.net/20.500.11750/5910 | - |
dc.description.abstract | Frequent itemset mining is widely used as a fundamental data mining technique. Recently, a number of MapReduce-based frequent itemset mining methods have been proposed to overcome the limits on data size and mining speed of sequential mining methods. However, the existing MapReduce-based methods still do not scale well because of high workload skewness, large intermediate data, and large network communication overhead. In this paper, we propose BIGMiner, a fast and scalable MapReduce-based frequent itemset mining method. BIGMiner generates equal-sized sub-databases called transaction chunks and performs support counting based only on transaction chunks and bitwise operations, without generating or shuffling intermediate data. As a result, BIGMiner achieves very high scalability, with no workload skewness, no intermediate data, and small network communication overhead. Through extensive experiments using large-scale datasets of up to 6.5 billion transactions, we show that BIGMiner consistently and significantly outperforms the state-of-the-art methods without any memory problems. © 2018 Springer Science+Business Media, LLC, part of Springer Nature | - |
dc.language | English | - |
dc.publisher | Springer New York LLC | - |
dc.title | BIGMiner: a fast and scalable distributed frequent pattern miner for big data | - |
dc.type | Article | - |
dc.identifier.doi | 10.1007/s10586-018-1812-0 | - |
dc.identifier.wosid | 000457275200004 | - |
dc.identifier.scopusid | 2-s2.0-85041818619 | - |
dc.type.local | Article(Overseas) | - |
dc.type.rims | ART | - |
dc.description.journalClass | 1 | - |
dc.citation.publicationname | Cluster Computing | - |
dc.contributor.nonIdAuthor | Chon, Kang-Wook | - |
dc.identifier.citationVolume | 21 | - |
dc.identifier.citationNumber | 3 | - |
dc.identifier.citationStartPage | 1507 | - |
dc.identifier.citationEndPage | 1520 | - |
dc.identifier.citationTitle | Cluster Computing | - |
dc.type.journalArticle | Article | - |
dc.description.isOpenAccess | N | - |
dc.subject.keywordAuthor | Big data | - |
dc.subject.keywordAuthor | Distributed algorithm | - |
dc.subject.keywordAuthor | Frequent pattern mining | - |
dc.subject.keywordAuthor | MapReduce | - |
dc.subject.keywordAuthor | Scalable algorithm | - |
dc.contributor.affiliatedAuthor | Chon, Kang-Wook | - |
dc.contributor.affiliatedAuthor | Kim, Min-Soo | - |
There are no files associated with this item.
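
The abstract describes support counting over equal-sized transaction chunks using bitwise operations. The following is a minimal single-process sketch of that general idea, not the BIGMiner implementation: the function names `build_chunk_bitmaps` and `support`, the use of Python integers as bitmaps, and the toy data are all assumptions made for illustration, and the MapReduce distribution step is omitted entirely.

```python
from typing import Dict, Iterable, List, Set, Tuple

# A chunk is (number of transactions in the chunk, item -> bitmap).
# Bit i of an item's bitmap is 1 if transaction i of the chunk contains the item.
Chunk = Tuple[int, Dict[str, int]]


def build_chunk_bitmaps(transactions: List[Set[str]], chunk_size: int) -> List[Chunk]:
    """Split the database into chunks of chunk_size transactions
    (the last chunk may be smaller) and build one bitmap per item per chunk."""
    chunks: List[Chunk] = []
    for start in range(0, len(transactions), chunk_size):
        part = transactions[start:start + chunk_size]
        bitmaps: Dict[str, int] = {}
        for pos, txn in enumerate(part):
            for item in txn:
                bitmaps[item] = bitmaps.get(item, 0) | (1 << pos)
        chunks.append((len(part), bitmaps))
    return chunks


def support(itemset: Iterable[str], chunks: List[Chunk]) -> int:
    """Count transactions containing every item of itemset by AND-ing
    the per-item bitmaps chunk by chunk and summing the set bits."""
    total = 0
    for size, bitmaps in chunks:
        acc = (1 << size) - 1  # start with every transaction in the chunk
        for item in itemset:
            acc &= bitmaps.get(item, 0)
            if acc == 0:
                break  # no transaction in this chunk can match
        total += bin(acc).count("1")
    return total


# Toy database (hypothetical data, for illustration only).
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}, {"a", "b"}]
chunks = build_chunk_bitmaps(transactions, chunk_size=2)
print(support({"a", "b"}, chunks))
```

Because each chunk's count depends only on that chunk's bitmaps, the per-chunk partial counts can be computed independently and summed, which is the property that makes this style of counting attractive for distributed settings.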