Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Jang, Yongjoo | - |
dc.contributor.author | Kim, Sejin | - |
dc.contributor.author | Kim, Daehoon | - |
dc.contributor.author | Lee, Sungjin | - |
dc.contributor.author | Kung, Jaeha | - |
dc.date.accessioned | 2021-10-07T12:00:16Z | - |
dc.date.available | 2021-10-07T12:00:16Z | - |
dc.date.created | 2021-06-14 | - |
dc.date.issued | 2021-01 | - |
dc.identifier.citation | IEEE Computer Architecture Letters, v.20, no.1, pp. 70-73 | - |
dc.identifier.issn | 1556-6056 | - |
dc.identifier.uri | http://hdl.handle.net/20.500.11750/15436 | - |
dc.description.abstract | In this paper, we present deep partitioned training to accelerate the computations involved in training DNN models. This is the first work to partition a DNN model across storage devices, an NPU, and a host CPU, forming a unified compute node for training workloads. To validate the benefit of the proposed system during DNN training, a trace-based simulator or an FPGA prototype is used to estimate overall performance and to find the partition layer index that yields the minimum latency (an illustrative sketch of this search appears below the record). As a case study, we select two benchmarks: vision-related tasks and a recommendation system. As a result, training time is reduced by 12.2-31.0% with four near-storage computing devices on the vision-related tasks with a mini-batch size of 512, and by 40.6-44.7% with one near-storage computing device on the selected recommendation system with a mini-batch size of 64. CC BY | - |
dc.language | English | - |
dc.publisher | Institute of Electrical and Electronics Engineers | - |
dc.title | Deep Partitioned Training from Near-Storage Computing to DNN Accelerators | - |
dc.type | Article | - |
dc.identifier.doi | 10.1109/LCA.2021.3081752 | - |
dc.identifier.wosid | 000660632200001 | - |
dc.identifier.scopusid | 2-s2.0-85106689474 | - |
dc.type.local | Article(Overseas) | - |
dc.type.rims | ART | - |
dc.description.journalClass | 1 | - |
dc.citation.publicationname | IEEE Computer Architecture Letters | - |
dc.contributor.nonIdAuthor | Jang, Yongjoo | - |
dc.contributor.nonIdAuthor | Kim, Sejin | - |
dc.identifier.citationVolume | 20 | - |
dc.identifier.citationNumber | 1 | - |
dc.identifier.citationStartPage | 70 | - |
dc.identifier.citationEndPage | 73 | - |
dc.identifier.citationTitle | IEEE Computer Architecture Letters | - |
dc.description.isOpenAccess | Y | - |
dc.subject.keywordAuthor | Computational modeling | - |
dc.subject.keywordAuthor | Data models | - |
dc.subject.keywordAuthor | DNN accelerators | - |
dc.subject.keywordAuthor | Indexes | - |
dc.subject.keywordAuthor | Kernel | - |
dc.subject.keywordAuthor | Near-storage computing | - |
dc.subject.keywordAuthor | Parallel processing | - |
dc.subject.keywordAuthor | Random access memory | - |
dc.subject.keywordAuthor | Training | - |
dc.subject.keywordAuthor | Training deep neural networks | - |
dc.subject.keywordAuthor | Workload partitioning | - |
dc.subject.keywordPlus | Virtual storage | - |
dc.subject.keywordPlus | Batch sizes | - |
dc.subject.keywordPlus | Computing devices | - |
dc.subject.keywordPlus | Fpga prototypes | - |
dc.subject.keywordPlus | Training time | - |
dc.subject.keywordPlus | Storage as a service (STaaS) | - |
dc.subject.keywordPlus | Deep neural networks | - |
dc.subject.keywordPlus | Recommender systems | - |
dc.contributor.affiliatedAuthor | Jang, Yongjoo | - |
dc.contributor.affiliatedAuthor | Kim, Sejin | - |
dc.contributor.affiliatedAuthor | Kim, Daehoon | - |
dc.contributor.affiliatedAuthor | Lee, Sungjin | - |
dc.contributor.affiliatedAuthor | Kung, Jaeha | - |
There are no files associated with this item.
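The abstract above describes searching for the partition layer index that minimizes end-to-end training latency. The sketch below is a minimal, hypothetical illustration of such a search, assuming a simple additive cost model: per-layer latency estimates for the near-storage computing (NSC) devices and the NPU, plus a one-time hand-off cost for the tensor that crosses the split. The function name, signature, and all numbers are assumptions made for illustration, not the paper's actual cost model or measurements.

```python
# Hypothetical sketch (not the paper's code): exhaustive search for the
# partition layer index k that minimizes estimated training-step latency.
# Layers [0, k) run on the near-storage computing (NSC) device, layers
# [k, n) on the NPU, and xfer_ms[k] is the cost of moving the tensor that
# crosses the split (xfer_ms[0] = shipping the raw mini-batch to the NPU).
# All names and numbers below are illustrative assumptions.

from typing import List, Tuple


def best_partition(nsc_ms: List[float],
                   npu_ms: List[float],
                   xfer_ms: List[float]) -> Tuple[int, float]:
    """Return (k, latency): the best split index and its estimated cost."""
    n = len(nsc_ms)
    assert len(npu_ms) == n and len(xfer_ms) == n + 1
    cost, k = min(
        (sum(nsc_ms[:k]) + xfer_ms[k] + sum(npu_ms[k:]), k)
        for k in range(n + 1)  # k = 0: all on NPU; k = n: all on NSC
    )
    return k, cost


if __name__ == "__main__":
    # Toy per-layer estimates (ms) for a 5-layer model -- made up.
    nsc_ms = [3.0, 3.5, 6.0, 9.0, 12.0]        # NSC lags on compute-heavy layers
    npu_ms = [1.0, 1.2, 1.5, 1.8, 2.0]         # NPU computes fast, but ...
    xfer_ms = [10.0, 2.0, 1.5, 1.2, 1.0, 0.8]  # ... raw input is costly to move
    k, t = best_partition(nsc_ms, npu_ms, xfer_ms)
    print(f"split at layer index {k}: estimated {t:.1f} ms per step")
```

With these toy costs the search settles on k = 1, keeping only the data-heavy first layer (e.g., an embedding lookup) near storage and running the compute-heavy tail on the NPU, which mirrors the abstract's recommendation-system case where a single near-storage device suffices.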