Department of Electrical Engineering and Computer Science
Image Processing Laboratory
1. Journal Articles
Deep Block Transform for Autoencoders
Jin, Kyong Hwan
Title
Deep Block Transform for Autoencoders
Issued Date
2021-05
Citation
Jin, Kyong Hwan. (2021-05). Deep Block Transform for Autoencoders. IEEE Signal Processing Letters, 28, 1016–1019. doi: 10.1109/LSP.2021.3082031
Type
Article
Author Keywords
autoencoder; Block transform; convolutional neural network; image representation
Keywords
Zero frequency; Learning systems; Convolution; Signal encoding; Adjoint operators; Block transforms; Blurry images; Convolution kernel; Dictionary learning; High resolution; Sliding Window
ISSN
1070-9908
Abstract
We discover that a trainable convolution layer with a stride greater than 1 and a kernel at least as wide as the stride is identical to a trainable block transform. For instance, with equal widths, such as a 2 × 2 convolution kernel and a stride of 2, the sliding windows do not overlap, so the layer performs a block transform on the partitioned 2 × 2 blocks. Because the stride is at least 2, the block transform also reduces computational complexity. To restore the original size, we apply a transposed convolution (stride = kernel ≥ 2), the adjoint operator of the forward block transform. Based on this relationship, we propose a trainable multi-scale block transform for autoencoders. The encoder consists of two sequential stride-2 convolutions with 2 × 2 kernels, and the decoder consists of the two corresponding adjoint operators (transposed convolutions). Clipping is used as the nonlinear activation. Inspired by the zero-frequency element in dictionary learning, the proposed method uses DC values for residual learning. The proposed method yields high-resolution representations, whereas a stride-1 convolutional autoencoder with 3 × 3 kernels produces blurry images.
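The construction described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the paper's implementation: the channel width, the clipping range (here [-1, 1]), and the per-image DC handling are illustrative choices.

import torch
import torch.nn as nn

# A stride-2 convolution with a 2 x 2 kernel has no overlap between sliding
# windows, so it acts as a learned block transform on the partitioned 2 x 2
# blocks; a transposed convolution with stride = kernel = 2 has the adjoint
# structure and restores the original spatial size.
class BlockTransformAE(nn.Module):
    def __init__(self, channels=1, width=16):  # width is an assumed value
        super().__init__()
        # Encoder: two sequential stride-2, 2 x 2 block transforms.
        self.enc1 = nn.Conv2d(channels, width, kernel_size=2, stride=2)
        self.enc2 = nn.Conv2d(width, width, kernel_size=2, stride=2)
        # Decoder: the two corresponding adjoint operators (transposed
        # convolutions), trained with their own weights here.
        self.dec2 = nn.ConvTranspose2d(width, width, kernel_size=2, stride=2)
        self.dec1 = nn.ConvTranspose2d(width, channels, kernel_size=2, stride=2)

    def forward(self, x):
        # Residual learning on the DC (mean) value of each image, loosely
        # following the zero-frequency element mentioned in the abstract.
        dc = x.mean(dim=(2, 3), keepdim=True)
        z = torch.clamp(self.enc1(x - dc), -1.0, 1.0)  # clipping activation
        z = torch.clamp(self.enc2(z), -1.0, 1.0)
        y = self.dec1(self.dec2(z))
        return y + dc

x = torch.randn(1, 1, 32, 32)
model = BlockTransformAE()
assert model(x).shape == x.shape  # size is preserved end to end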
URI
http://hdl.handle.net/20.500.11750/15400
DOI
10.1109/LSP.2021.3082031
Publisher
Institute of Electrical and Electronics Engineers