Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Jin, Kyong Hwan | - |
dc.date.accessioned | 2021-10-05T08:30:14Z | - |
dc.date.available | 2021-10-05T08:30:14Z | - |
dc.date.created | 2021-06-18 | - |
dc.date.issued | 2021-05 | - |
dc.identifier.issn | 1070-9908 | - |
dc.identifier.uri | http://hdl.handle.net/20.500.11750/15400 | - |
dc.description.abstract | We discover that a trainable convolution layer with a stride greater than 1 and a kernel ≥ the stride is identical to a trainable block transform. A block transform is performed when we use a convolution layer with a stride ≥ 2 and a kernel ≥ the stride. For instance, if we use the same widths, such as a 2 × 2 convolution kernel and stride 2, there are no overlaps between sliding windows, so this layer performs a block transform on the partitioned 2 × 2 blocks. A block transform reduces the computational complexity because of the stride ≥ 2. To restore the original size, we apply a transposed convolution (stride = kernel ≥ 2), the adjoint operator of the forward block transform. Based on this relationship, we propose a trainable multi-scale block transform for autoencoders. The proposed method has an encoder consisting of two sequential convolutions with stride 2 and a 2 × 2 kernel, and a decoder consisting of the encoder's two adjoint operators (transposed convolutions). Clipping is used for the nonlinear activations. Inspired by the zero-frequency element in dictionary learning, the proposed method uses DC values for residual learning. The proposed method produces high-resolution representations, whereas a stride-1 convolutional autoencoder with 3 × 3 kernels generates blurry images. © 1994-2012 IEEE. | - |
dc.language | English | - |
dc.publisher | Institute of Electrical and Electronics Engineers | - |
dc.title | Deep Block Transform for Autoencoders | - |
dc.type | Article | - |
dc.identifier.doi | 10.1109/LSP.2021.3082031 | - |
dc.identifier.scopusid | 2-s2.0-85107189040 | - |
dc.identifier.bibliographicCitation | IEEE Signal Processing Letters, v.28, pp.1016 - 1019 | - |
dc.description.isOpenAccess | FALSE | - |
dc.subject.keywordAuthor | autoencoder | - |
dc.subject.keywordAuthor | Block transform | - |
dc.subject.keywordAuthor | convolutional neural network | - |
dc.subject.keywordAuthor | image representation | - |
dc.subject.keywordPlus | Zero frequency | - |
dc.subject.keywordPlus | Learning systems | - |
dc.subject.keywordPlus | Convolution | - |
dc.subject.keywordPlus | Signal encoding | - |
dc.subject.keywordPlus | Adjoint operators | - |
dc.subject.keywordPlus | Block transforms | - |
dc.subject.keywordPlus | Blurry images | - |
dc.subject.keywordPlus | Convolution kernel | - |
dc.subject.keywordPlus | Dictionary learning | - |
dc.subject.keywordPlus | High resolution | - |
dc.subject.keywordPlus | Sliding Window | - |
dc.citation.endPage | 1019 | - |
dc.citation.startPage | 1016 | - |
dc.citation.title | IEEE Signal Processing Letters | - |
dc.citation.volume | 28 | - |
There are no files associated with this item.
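The abstract's central observation — that a convolution whose stride equals its kernel width partitions the input into non-overlapping blocks and applies the same linear map to each — can be checked numerically. The sketch below is illustrative only (array sizes, variable names, and the NumPy implementation are assumptions, not from the paper): it compares a stride-2, 2 × 2 cross-correlation against an explicit block transform that flattens each 2 × 2 block and multiplies it by the flattened kernels.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))     # single-channel 8x8 input (illustrative size)
W = rng.standard_normal((4, 2, 2))  # 4 output channels, 2x2 kernel

# (a) Stride-2 "convolution" (cross-correlation, no padding): because
# stride == kernel width, the sliding windows never overlap.
conv = np.empty((4, 4, 4))
for c in range(4):
    for i in range(4):
        for j in range(4):
            conv[c, i, j] = np.sum(W[c] * x[2*i:2*i+2, 2*j:2*j+2])

# (b) Equivalent block transform: partition x into non-overlapping 2x2
# blocks, flatten each block to a length-4 vector, and apply the 4x4
# analysis matrix whose rows are the flattened kernels.
blocks = x.reshape(4, 2, 4, 2).transpose(0, 2, 1, 3).reshape(4, 4, 4)
A = W.reshape(4, 4)                 # each kernel flattened to one row
bt = np.einsum('ck,ijk->cij', A, blocks)

assert np.allclose(conv, bt)       # the two operations coincide
```

Since the non-overlapping windows make the forward map block-diagonal, the adjoint (the transposed convolution the paper uses in the decoder) simply applies `A.T` to each length-4 coefficient vector and scatters the result back into the corresponding 2 × 2 block.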