1. Big Data Systems
One of our major research areas is big data systems for various kinds of data, including relations, graphs, and scientific arrays. For relational data, we mainly focus on query optimization techniques on Hadoop. For graph data, we have proposed a GPU-based graph processing method called GStream, which achieves a processing speed of 1,400 MTEPS (million traversed edges per second) using only a single PC equipped with two GPUs. For scientific data, we are developing a new scalable method for efficient query processing of NASA satellite data on a distributed system. In terms of operations, we study not only database operations but also data mining operations. Our ultimate goal is to develop a big data system that can process complex operations on a mixture of large-scale, multi-typed data in a highly optimized way, especially on a cluster of computers or an HPC system (e.g., a supercomputer) equipped with GPUs and a high-speed network.
2. Large-scale Genomics & Proteomics
R&D in modern biotechnology relies heavily on large-scale bio data processing. We focus especially on genomics for DNA data and proteomics for protein data. For genomics, we are developing a MapReduce-based method for designing sets of short sequences called primers, which can amplify target DNA sequences in PCR experiments. In addition, we are developing a new large-scale method that can detect diseases such as cancer using whole-genome sequencing (WGS) or whole-exome sequencing (WES) data generated by next-generation sequencing (NGS) technology. For proteomics, we focus on developing a new approach for accurate and fast analysis of tandem mass spectra data that takes protein modifications into account.
3. Large-scale Deep Learning
In recent years, deep learning models such as CNN and DBN have emerged as a new and powerful paradigm for pattern recognition and classification. We are doing research on a scale-out deep learning system that exploits the computing power of GPUs. We are also interested in developing a deep interpretation model of multi-channel signals, especially invasive brain signals such as ECoG. Currently, we focus on the interpretation of real-time 32-channel signals through medical experiments using mice as the animal model.

Advisor Professor: Kim, Min-Soo
InfoLab Homepage