A Memory-Efficient and Scalable GPU-Based Tucker Decomposition Method for Large-Scale Tensors
Title
A Memory-Efficient and Scalable GPU-Based Tucker Decomposition Method for Large-Scale Tensors
Alternative Title
대규모 텐서를 위한 메모리 효율적이고 확장 가능한 GPU 기반의 Tucker 분해 방법
DGIST Authors
Jihye Lee; Sungjin Lee; Min-Soo Kim
Advisor
Sungjin Lee (이성진)
Co-Advisor(s)
Min-Soo Kim
Issued Date
2025
Awarded Date
2025-02-01
Citation
Jihye Lee. (2025). A Memory-Efficient and Scalable GPU-Based Tucker Decomposition Method for Large-Scale Tensors. doi: 10.22677/THESIS.200000841755
Type
Thesis
Description
Tensor decomposition, Big data, GPU, Scalable algorithm, Memory-efficient method
Table Of Contents
1. Introduction 1
1.1 Motivation and objectives 1
1.2 Main contributions 6
1.3 Structure of thesis 7
2. Background 10
2.1 Notations 10
2.2 Tensor operations 11
2.2.1 Fiber and Slice 11
2.2.2 Frobenius Norm 12
2.2.3 Matricization 12
2.2.4 n-mode product 13
2.3 Tucker decomposition methods 15
2.3.1 Methods for Tucker decomposition 16
2.3.2 Differences between HOSVD and HOOI 18
2.3.3 Challenges and solutions in computing Tucker decomposition 19
2.3.4 Row-wise update rules 20
3. Large-Scale GPU-Based Tucker Decomposition Using Tensor Partitioning 22
3.1 GPUTucker 22
3.1.1 Overview of GPUTucker 22
3.1.2 Tensor partitioning technique 24
3.1.3 Optimization of tensor partitioning 27
3.1.4 GPU-based data pipeline 29
3.2 Exploitation of Multiple GPUs 35
3.2.1 Core tensor sharing scheme 35
3.2.2 Advantages of the core tensor sharing scheme 37
3.3 Space and Time Cost Analysis of GPUTucker 39
3.3.1 Space cost 39
3.3.2 Time cost 41
3.4 Experiments 44
3.4.1 Experimental setup 44
3.4.2 Comparison with SOTA methods 46
3.4.3 Varying characteristics of datasets 49
3.4.4 Varying configurations of GPUs 51
3.4.5 Varying partition parameters 52
4. A Memory-Efficient and Flexible GPU-based Tucker Decomposition 56
4.1 FLICO 56
4.1.1 Linearization of high-dimensional tensors 57
4.1.2 Ordering of tensor linearization 58
4.1.3 Eliminating redundant computations 58
4.2 Experiments 60
4.2.1 Experimental setup 61
4.2.2 Comparison with SOTA methods 62
4.2.3 Varying ordering of tensor linearization 63
5. Related work 66
5.1 Multi-threaded Tucker decomposition methods 66
5.2 Distributed Tucker decomposition methods 68
5.3 GPU-based Tucker decomposition methods 69
6. Conclusions 71
URI
http://hdl.handle.net/20.500.11750/57991
http://dgist.dcollection.net/common/orgView/200000841755
DOI
10.22677/THESIS.200000841755
Degree
Doctor
Department
Department of Electrical Engineering and Computer Science
Publisher
DGIST

File Downloads

  • There are no files associated with this item.
