Multilayer feedforward networks are universal approximators

Why this mattered

Hornik, Stinchcombe, and White’s paper helped settle a foundational question about neural networks: whether their usefulness depended on special properties of particular problems, or whether feedforward networks had a general representational capacity. The paper showed that standard multilayer feedforward networks with as few as one hidden layer and an arbitrary squashing activation function can approximate any Borel-measurable function from one finite-dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available. This shifted neural networks from being viewed mainly as heuristic pattern-recognition devices toward being mathematically legitimate function approximators.
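As a rough illustration of what this approximation claim means in the simplest setting, the sketch below builds a one-hidden-layer network of logistic units by hand, so that it traces a staircase through samples of a continuous function on [0, 1]. This is not the paper's proof technique and involves no training. The names `build_network`, `predict`, and `max_error`, the target function, and the steepness heuristic are all assumptions made for the example.

```python
import math


def sigmoid(z):
    # Logistic squashing function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def build_network(f, n_hidden, steepness=None):
    """Hand-construct a one-hidden-layer network for f on [0, 1].

    Each hidden unit is a steep sigmoid centred at the midpoint of one
    grid cell; its output weight is the change in f across that cell.
    Returns (output_bias, [(beta, w, b), ...]).
    """
    if steepness is None:
        steepness = 50.0 * n_hidden  # assumed heuristic: sharper steps on finer grids
    xs = [k / n_hidden for k in range(n_hidden + 1)]
    units = []
    for k in range(1, n_hidden + 1):
        jump = f(xs[k]) - f(xs[k - 1])       # output weight beta_k
        mid = 0.5 * (xs[k - 1] + xs[k])      # where the step switches on
        units.append((jump, steepness, -steepness * mid))
    return f(xs[0]), units


def predict(net, x):
    # Network output: bias + sum_k beta_k * sigmoid(w_k * x + b_k).
    bias, units = net
    return bias + sum(beta * sigmoid(w * x + b) for beta, w, b in units)


def max_error(f, net, n_grid=1000):
    # Largest absolute error over an evenly spaced evaluation grid.
    return max(
        abs(f(i / n_grid) - predict(net, i / n_grid))
        for i in range(n_grid + 1)
    )


if __name__ == "__main__":
    target = lambda x: math.sin(2 * math.pi * x)
    for n in (10, 40, 160):
        err = max_error(target, build_network(target, n))
        print(f"{n:4d} hidden units: max error {err:.4f}")
```

Run as a script, this prints the worst-case error for 10, 40, and 160 hidden units. The error shrinks as units are added, which is the sense in which accuracy is limited only by the number of hidden units. The construction says nothing about whether gradient-based training would find such weights.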

The result did not show that such networks could be efficiently trained, that they would generalize well, or that a practical architecture could be found for every task. Its importance was more basic: it established that any failure of a neural network could not, in principle, be blamed on a lack of expressive power. After this, the central research questions could move toward optimization, data, regularization, architecture, and computation. That distinction became crucial for later developments, from the backpropagation-trained multilayer networks of the 1990s to deep learning systems whose success depended on scalable training rather than merely on representational possibility.

Historically, the paper sits alongside other universal approximation results from the late 1980s, including Cybenko’s theorem, but Hornik, Stinchcombe, and White gave the claim a broad and influential formulation for multilayer feedforward networks. Its legacy is that it supplied one of the mathematical pillars beneath the modern neural-network paradigm. Because neural networks could be understood as general-purpose approximators, they became plausible candidates for modeling complex nonlinear phenomena across vision, speech, language, control, and scientific prediction.

Sources