Reducing the Dimensionality of Data with Neural Networks
Why this mattered
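Deep autoencoders had been hard to train because gradient descent works well only when the initial weights are already close to a good solution. This paper describes a way of initializing the weights that lets deep autoencoders learn low-dimensional codes that work much better than principal components analysis. The entries under Related trace how that result fed into later work on deep networks, including acoustic modeling for speech recognition.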
Abstract
High-dimensional data can be converted to low-dimensional codes by training a multilayer neural network with a small central layer to reconstruct high-dimensional input vectors. Gradient descent can be used for fine-tuning the weights in such "autoencoder" networks, but this works well only if the initial weights are close to a good solution. We describe an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.
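The idea in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the layer sizes, toy data set, learning rate, and all function names are hypothetical. It trains a small autoencoder with a 2-unit central layer by plain gradient descent from small random weights, so it deliberately omits the paper's weight-initialization step. It also computes a PCA reconstruction with the same number of components as a baseline. No claim is made about which method does better on this toy data.

```python
import numpy as np

def init_params(sizes, rng, scale=0.1):
    """Small random weights and zero biases (NOT the paper's initialization)."""
    return [(scale * rng.standard_normal((m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x, code_layer):
    """Activations of every layer: tanh on hidden layers, linear code and output."""
    acts = [x]
    last = len(params) - 1
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        acts.append(z if i in (code_layer, last) else np.tanh(z))
    return acts

def loss_and_grads(params, x, code_layer):
    """Mean squared reconstruction error and its gradients via backpropagation."""
    acts = forward(params, x, code_layer)
    err = acts[-1] - x
    loss = np.mean(np.sum(err ** 2, axis=1))
    delta = 2.0 * err / x.shape[0]          # gradient w.r.t. the linear output
    grads = [None] * len(params)
    for i in range(len(params) - 1, -1, -1):
        grads[i] = (acts[i].T @ delta, delta.sum(axis=0))
        if i > 0:
            delta = delta @ params[i][0].T  # gradient w.r.t. acts[i]
            if i - 1 != code_layer:         # acts[i] came from a tanh layer
                delta = delta * (1.0 - acts[i] ** 2)
    return loss, grads

def train(x, sizes, code_layer, steps=500, lr=0.05, seed=0):
    """Full-batch gradient descent; returns final parameters and loss history."""
    params = init_params(sizes, np.random.default_rng(seed))
    history = []
    for _ in range(steps):
        loss, grads = loss_and_grads(params, x, code_layer)
        history.append(loss)
        params = [(W - lr * gW, b - lr * gb)
                  for (W, b), (gW, gb) in zip(params, grads)]
    return params, history

def pca_reconstruction_error(x, k):
    """Reconstruction error of the best rank-k linear code (PCA via SVD)."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    recon = (x - mean) @ vt[:k].T @ vt[:k] + mean
    return np.mean(np.sum((recon - x) ** 2, axis=1))

# Toy data: 200 points on a curve embedded in 6 dimensions (illustrative only).
t = np.random.default_rng(1).uniform(-1.0, 1.0, size=(200, 1))
x = np.hstack([t, t ** 2, np.sin(3 * t), np.cos(2 * t), t ** 3, np.tanh(2 * t)])

sizes, code_layer = [6, 8, 2, 8, 6], 1      # 6 -> 8 -> 2 (code) -> 8 -> 6
params, history = train(x, sizes, code_layer)
codes = forward(params, x, code_layer)[code_layer + 1]   # 2-D code per input

print("autoencoder error: start %.4f, end %.4f" % (history[0], history[-1]))
print("PCA (2 components) error: %.4f" % pca_reconstruction_error(x, 2))
```

Starting from small random weights is the setting the abstract describes as unreliable for deep autoencoders. In this sketch, replacing `init_params` with a pretraining step is where an initialization scheme like the paper's would enter.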
Related
- cite → A Global Geometric Framework for Nonlinear Dimensionality Reduction — The neural autoencoder paper compares deep learned embeddings with Isomap's global-geodesic approach to nonlinear dimensionality reduction.
- cite → Nonlinear Dimensionality Reduction by Locally Linear Embedding — The autoencoder paper contrasts its neural-network embedding with locally linear embedding as another nonlinear manifold-learning method for dimensionality reduction.
- cite → Neural networks and physical systems with emergent collective computational abilities. — The autoencoder paper builds on Hopfield-style neural networks as an earlier demonstration that distributed neural systems can store and compute with collective representations.
- cite → A Fast Learning Algorithm for Deep Belief Nets — The autoencoder paper uses deep belief net pretraining from Hinton et al. to initialize deep autoencoders before fine-tuning them for dimensionality reduction.
- cite → Indexing by latent semantic analysis — The autoencoder paper compares learned low-dimensional codes with latent semantic analysis as a linear dimensionality-reduction baseline for document data.
- enables → Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups — Deep autoencoder pretraining showed how to initialize multilayer networks, helping revive deep architectures for speech recognition.
- enables → Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks — Neural-network dimensionality reduction via autoencoders helped establish learned latent representations later exploited by CycleGAN for unpaired image translation.
- enables → Human-level control through deep reinforcement learning — Deep autoencoder work helped establish multilayer neural representations that made deep learning practical for later deep reinforcement learning systems.
- cite ← Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups — The speech-recognition DNN paper cites deep autoencoders as evidence that multilayer neural networks can learn compact representations.
- cite ← Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks — The CycleGAN paper cites deep autoencoders as prior work on learning compact latent representations for image data.
- cite ← Human-level control through deep reinforcement learning — The DQN paper cites deep autoencoders as evidence that multilayer neural networks can learn compact representations from high-dimensional inputs.
- enables ← A Global Geometric Framework for Nonlinear Dimensionality Reduction — Isomap enables deep autoencoder dimensionality reduction by establishing nonlinear manifold learning as a benchmark for preserving global geometry.
- enables ← Nonlinear Dimensionality Reduction by Locally Linear Embedding — Locally linear embedding enables deep autoencoder dimensionality reduction by defining local-neighborhood reconstruction as a nonlinear manifold-learning objective.
- enables ← Neural networks and physical systems with emergent collective computational abilities. — Hopfield networks enable deep autoencoder dimensionality reduction by showing neural networks can learn distributed energy-based representations.
- enables ← Indexing by latent semantic analysis — Latent semantic analysis enables deep autoencoder dimensionality reduction by demonstrating that low-dimensional continuous embeddings can capture semantic structure.