Department of Electrical Engineering and Computer Science
Image Processing Laboratory
1. Journal Articles
Deep Block Transform for Autoencoders
Jin, Kyong Hwan
Title
Deep Block Transform for Autoencoders
Issued Date
2021-05
Citation
Jin, Kyong Hwan. (2021-05). Deep Block Transform for Autoencoders. IEEE Signal Processing Letters, 28, 1016–1019. doi: 10.1109/LSP.2021.3082031
Type
Article
Author Keywords
autoencoder; Block transform; convolutional neural network; image representation
Keywords
Zero frequency; Learning systems; Convolution; Signal encoding; Adjoint operators; Block transforms; Blurry images; Convolution kernel; Dictionary learning; High resolution; Sliding Window
ISSN
1070-9908
Abstract
We discover that a trainable convolution layer with a stride greater than 1 and a kernel at least as wide as the stride is identical to a trainable block transform. For instance, with equal widths, such as a 2 × 2 convolution kernel and a stride of 2, the sliding windows do not overlap, so the layer performs a block transform on the partitioned 2 × 2 blocks. Because the stride is at least 2, the block transform also reduces computational complexity. To restore the original size, we apply a transposed convolution (stride = kernel ≥ 2), the adjoint operator of the forward block transform. Based on this relationship, we propose a trainable multi-scale block transform for autoencoders. The encoder consists of two sequential stride-2 convolutions with 2 × 2 kernels, and the decoder consists of the two corresponding adjoint operators (transposed convolutions). Clipping is used as the nonlinear activation. Inspired by the zero-frequency element in dictionary learning, the proposed method uses DC values for residual learning. The proposed method yields high-resolution representations, whereas a stride-1 convolutional autoencoder with 3 × 3 kernels produces blurry images.
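The construction described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the paper's implementation: the channel width, the clipping range (here [-1, 1]), and the per-image DC handling are illustrative choices.

import torch
import torch.nn as nn

# A stride-2 convolution with a 2 x 2 kernel has no overlap between sliding
# windows, so it acts as a learned block transform on the partitioned 2 x 2
# blocks; a transposed convolution with stride = kernel = 2 has the adjoint
# structure and restores the original spatial size.
class BlockTransformAE(nn.Module):
    def __init__(self, channels=1, width=16):  # width is an assumed value
        super().__init__()
        # Encoder: two sequential stride-2, 2 x 2 block transforms.
        self.enc1 = nn.Conv2d(channels, width, kernel_size=2, stride=2)
        self.enc2 = nn.Conv2d(width, width, kernel_size=2, stride=2)
        # Decoder: the two corresponding adjoint operators (transposed
        # convolutions), trained with their own weights here.
        self.dec2 = nn.ConvTranspose2d(width, width, kernel_size=2, stride=2)
        self.dec1 = nn.ConvTranspose2d(width, channels, kernel_size=2, stride=2)

    def forward(self, x):
        # Residual learning on the DC (mean) value of each image, loosely
        # following the zero-frequency element mentioned in the abstract.
        dc = x.mean(dim=(2, 3), keepdim=True)
        z = torch.clamp(self.enc1(x - dc), -1.0, 1.0)  # clipping activation
        z = torch.clamp(self.enc2(z), -1.0, 1.0)
        y = self.dec1(self.dec2(z))
        return y + dc

x = torch.randn(1, 1, 32, 32)
model = BlockTransformAE()
assert model(x).shape == x.shape  # size is preserved end to end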
URI
http://hdl.handle.net/20.500.11750/15400
DOI
10.1109/LSP.2021.3082031
Publisher
Institute of Electrical and Electronics Engineers