CTRL: Closed-Loop Transcription to an LDR via Minimaxing Rate Reduction
By: Xili Dai, Shengbang Tong, Mingyang Li, Ziyang Wu, Michael Psenka, Kwan Ho Ryan Chan, Pengyuan Zhai, Yaodong Yu, Xiaojun Yuan, Heung-Yeung Shum, Yi Ma
Format: Article
Published: MDPI AG, 2022-03-01
Description
This work proposes a new computational framework for learning a structured generative model for real-world datasets. In particular, we propose to learn <i>a <b>C</b>losed-loop <b>Tr</b>anscription</i> between a multi-class, multi-dimensional data distribution and a <i><b>L</b>inear discriminative representation</i> (<i>CTRL</i>) in the feature space, where the representation consists of multiple independent multi-dimensional linear subspaces. We argue that the optimal encoding and decoding mappings sought can be formulated as a <i>two-player minimax game between the encoder and decoder</i> for the learned representation. A natural utility function for this game is the so-called <i>rate reduction</i>, a simple information-theoretic measure for distances between mixtures of subspace-like Gaussians in the feature space. Our formulation draws inspiration from closed-loop error feedback in control systems and avoids the expensive evaluation and minimization of approximated distances between arbitrary distributions in either the data space or the feature space. To a large extent, this new formulation unifies the concepts and benefits of Auto-Encoding and GAN and naturally extends them to the setting of learning a representation that is <i>both discriminative and generative</i> for multi-class, multi-dimensional real-world data. Our extensive experiments on many benchmark imagery datasets demonstrate the tremendous potential of this new closed-loop formulation: under fair comparison, the visual quality of the learned decoder and the classification performance of the encoder are competitive with, and arguably better than, existing methods based on GAN, VAE, or a combination of both.
Unlike in existing generative models, the learned features of the multiple classes are structured rather than hidden: different classes are explicitly mapped onto corresponding <i>independent principal subspaces</i> in the feature space, and diverse visual attributes within each class are modeled by the <i>independent principal components</i> within each subspace.
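To make the <i>rate reduction</i> objective mentioned above concrete, the following is a minimal sketch in NumPy. It assumes the standard formulation from the rate-reduction literature: the coding rate of a feature matrix Z is a log-determinant of its covariance, and the rate reduction is the rate of the whole feature set minus the class-size-weighted average rate of the per-class subsets. The function names, the quantization parameter `eps`, and the column-wise feature layout are illustrative choices, not the authors' reference implementation.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Coding rate R(Z): bits needed to encode the d x n feature
    matrix Z up to precision eps, assuming a Gaussian-like source.
    R(Z) = 1/2 * logdet(I + d / (n * eps^2) * Z Z^T)."""
    d, n = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T))
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """Rate reduction: coding rate of all features minus the
    class-size-weighted average coding rate of each class subset.
    Larger values mean classes occupy more 'spread out', independent
    subspaces while each class stays compact."""
    n = Z.shape[1]
    rate_whole = coding_rate(Z, eps)
    rate_per_class = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]               # features of class c
        rate_per_class += (Zc.shape[1] / n) * coding_rate(Zc, eps)
    return rate_whole - rate_per_class
```

In the paper's closed-loop transcription, a quantity of this form serves as the utility in the minimax game: roughly, the encoder seeks to increase the rate reduction (expanding the overall representation while compressing each class), while the decoder plays against it through the transcription loop.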