Multilayer feedforward networks are universal approximators
Why this mattered
Hornik, Stinchcombe, and White's paper helped settle a foundational question about neural networks: whether their usefulness depended on special properties of particular problems, or whether feedforward networks had a general representational capacity. The paper showed that standard multilayer feedforward networks with as few as one hidden layer and arbitrary squashing activation functions can approximate any Borel-measurable function from one finite-dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available. This moved neural networks from being viewed mainly as heuristic pattern-recognition devices toward being understood as function approximators with a rigorous mathematical justification.
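The approximation claim above can be made concrete in one dimension. The following is a minimal sketch, not the paper's proof technique: it assumes a continuous scalar target on [0, 1], uses the logistic function as the squashing function, and sets the weights by hand with a staircase construction instead of training. The names `build_network`, `predict`, `max_error`, and `target` are hypothetical helpers introduced only for this illustration.

```python
import math


def sigmoid(z):
    """Logistic squashing function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def build_network(f, n_hidden, steepness=20.0):
    """Build a one-hidden-layer sigmoid network approximating f on [0, 1].

    Illustrative assumption: weights are chosen by hand, not learned.
    Hidden unit k is a steep sigmoid centred in the k-th grid cell, and its
    output weight is the change of f across that cell.
    """
    h = 1.0 / n_hidden
    units = []
    for k in range(n_hidden):
        left, right = k * h, (k + 1) * h
        output_weight = f(right) - f(left)
        input_weight = steepness / h  # narrower cells need steeper units
        centre = 0.5 * (left + right)
        units.append((output_weight, input_weight, centre))
    return f(0.0), units  # (output bias, hidden units)


def predict(net, x):
    """Evaluate the network: bias plus a weighted sum of sigmoid units."""
    bias, units = net
    return bias + sum(w * sigmoid(a * (x - c)) for w, a, c in units)


def max_error(f, net, n_test=1000):
    """Largest absolute error on an evenly spaced test grid over [0, 1]."""
    return max(
        abs(f(i / n_test) - predict(net, i / n_test)) for i in range(n_test + 1)
    )


def target(x):
    """Example target function (an arbitrary choice for the demo)."""
    return math.sin(2.0 * math.pi * x)


if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        net = build_network(target, n)
        print(f"hidden units = {n:4d}   max error = {max_error(target, net):.4f}")
```

Running the script prints the worst-case error for each network size; under these assumptions the error shrinks as hidden units are added, which is the one-dimensional flavor of "to any desired degree of accuracy". The construction covers only continuous functions on an interval, whereas the paper's theorem is considerably more general.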
The result is an existence statement. It did not show that such networks could be efficiently trained, that they would generalize well, how many hidden units a given accuracy requires, or that a practical architecture could be found for every task. Its importance was more basic: it established that the limitation of neural networks was not, in principle, expressive power. After this, central research questions could move toward optimization, data, regularization, architecture, and computation. That distinction became crucial for later developments, from backpropagation-driven multilayer networks to deep learning systems whose success depended on scalable training rather than merely on representational possibility.
Historically, the paper sits alongside other universal approximation results from the late 1980s, including Cybenko’s theorem, but Hornik, Stinchcombe, and White gave the claim a broad and influential formulation for multilayer feedforward networks. Its legacy is that it supplied one of the mathematical pillars beneath the modern neural-network paradigm: neural networks could be understood as general-purpose approximators, making them plausible candidates for modeling complex nonlinear phenomena across vision, speech, language, control, and scientific prediction.
Abstract
(no abstract available)
Related
- cite → Approximation by superpositions of a sigmoidal function — Hornik, Stinchcombe, and White generalize Cybenko's sigmoidal-function approximation theorem to multilayer feedforward neural networks.
- enables → Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations — The universal approximation theorem enabled PINNs by justifying neural networks as flexible function approximators for PDE solution fields.
- cite ← Approximation by superpositions of a sigmoidal function — Both papers prove universal approximation for multilayer neural networks with sigmoidal nonlinearities.
- cite ← Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations — Physics-informed neural networks rely on the universal approximation capacity of feedforward networks to represent PDE solution functions.