Ⅰ. INTRODUCTION
    1.1 Background
    1.2 Review of MRPrimer
Ⅱ. IMPLEMENTATION
    2.1 Overview of GPrimer
    2.2 Steps 1~3: Building Hash Maps and Processing using CPU threads
    2.3 Step 4: Building Arrays and General Cross-Hybridization Filtering
    2.4 Step 5: Pair Filtering and Ranking
Ⅲ. RESULTS
    3.1 Experimental setup and Data sets
    3.2 Performance Comparison with MRPrimer
    3.3 Performance of GPrimer varying the number of GPUs
    3.4 Effectiveness of workload balancing and streaming-copy
    3.5 Memory usage
Ⅳ. CONCLUSIONS
Ⅴ. REFERENCES