
A new look at the statistical model identification

Why this mattered

Akaike’s 1974 paper recast model choice from a hypothesis-testing problem as an information-theoretic prediction problem. Instead of asking whether a restricted model could be rejected against a larger one, Akaike framed statistical identification as choosing the model expected to lose the least information relative to the unknown data-generating process. The resulting criterion, AIC = -2 log L + 2k, where L is the maximized likelihood and k is the number of independently adjusted parameters, made the tradeoff explicit: better fit is rewarded, but each additional parameter is penalized. This mattered because it gave researchers a general, computable rule for comparing competing models, including non-nested ones, and avoided many ambiguities of sequential significance testing.
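As a small worked illustration of that tradeoff, the sketch below evaluates the criterion for two candidate models. The log-likelihood values and parameter counts are hypothetical numbers chosen for the arithmetic, not results from the paper, and the function name `aic` is an assumption made for this example.

```python
def aic(log_likelihood: float, k: int) -> float:
    """AIC = -2 log L + 2k, with log L the maximized log-likelihood."""
    return -2.0 * log_likelihood + 2.0 * k


# Hypothetical fits: the larger model attains a slightly higher likelihood,
# but pays for three extra parameters.
aic_small = aic(log_likelihood=-120.0, k=3)  # 240 + 6  = 246
aic_large = aic(log_likelihood=-118.5, k=6)  # 237 + 12 = 249

preferred = "small" if aic_small < aic_large else "large"
print(aic_small, aic_large, preferred)
```

With these assumed numbers, the gain in fit (a 3-point drop in -2 log L) does not offset the added penalty (6 points), so the smaller model has the lower AIC.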

The criterion made routine model selection across realistic candidate families practical, especially in time series, forecasting, econometrics, ecology, psychology, and later in statistical modeling close to machine learning. AIC turned maximum likelihood estimation into a model identification workflow: estimate each candidate model, compute a single score, and prefer the model with the smallest AIC, that is, the smallest estimated information loss. That made model comparison less dependent on arbitrary test order, null-model privilege, or fixed significance thresholds, and helped normalize the idea that models should be judged by expected out-of-sample adequacy rather than only by in-sample fit or formal rejection.
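The workflow described above (fit each candidate, score it, keep the minimum) can be sketched for autoregressive order selection, the kind of time series problem the paper addresses. This is a minimal sketch under stated assumptions, not Akaike's own procedure: the data are simulated from a zero-mean AR(2) process, each AR(p) model is fitted by conditional least squares (which coincides with conditional Gaussian maximum likelihood), all candidates are scored on the same observations, and k counts the p coefficients plus the noise variance. The helper names `fit_ar_residuals` and `gaussian_aic` are hypothetical.

```python
import numpy as np


def gaussian_aic(residuals: np.ndarray, k: int) -> float:
    """AIC for a Gaussian model evaluated at the ML noise variance."""
    n = residuals.size
    sigma2 = np.mean(residuals ** 2)  # ML estimate of the noise variance
    log_lik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    return -2.0 * log_lik + 2.0 * k


def fit_ar_residuals(x: np.ndarray, p: int, max_p: int) -> np.ndarray:
    """Residuals of a zero-mean AR(p) fit by conditional least squares.

    The first max_p observations are held out for every p, so all
    candidate orders are scored on the same targets.
    """
    y = x[max_p:]
    if p == 0:
        return y  # white-noise model: nothing to predict
    lags = np.column_stack([x[max_p - j : len(x) - j] for j in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(lags, y, rcond=None)
    return y - lags @ coef


# Simulate a zero-mean AR(2) series (assumed coefficients 0.6 and -0.3).
rng = np.random.default_rng(0)
n = 500
noise = rng.standard_normal(n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + noise[t]

# Score every candidate order and keep the one with the smallest AIC.
max_p = 5
scores = {
    p: gaussian_aic(fit_ar_residuals(x, p, max_p), k=p + 1)
    for p in range(max_p + 1)
}
best_order = min(scores, key=scores.get)

for p, score in scores.items():
    print(f"AR({p}): AIC = {score:.2f}")
print("selected order:", best_order)
```

Holding out the same initial observations for every order is a design choice made here so that the likelihoods are comparable; AIC values computed on different samples should not be compared directly.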

The paper also helped found a broader information-criterion tradition. Criteria such as BIC, AICc, DIC, and WAIC, along with cross-validation-based approaches, differ in assumptions and goals, but they share the framing that Akaike’s paper made central: model selection as penalized predictive adequacy under uncertainty. Its influence is visible in modern statistical practice whenever researchers compare candidate models by balancing fit against complexity, and in the later development of model averaging, predictive validation, and information-theoretic approaches to scientific inference.

Abstract

The history of the development of statistical hypothesis testing in time series analysis is reviewed briefly and it is pointed out that the hypothesis testing procedure is not adequately defined as the procedure for statistical model identification. The classical maximum likelihood estimation procedure is reviewed and a new estimate minimum information theoretical criterion (AIC) estimate (MAICE) which is designed for the purpose of statistical identification is introduced. When there are several competing models the MAICE is defined by the model and the maximum likelihood estimates of the parameters which give the minimum of AIC defined by AIC = (-2)log-(maximum likelihood) + 2(number of independently adjusted parameters within the model). MAICE provides a versatile procedure for statistical model identification which is free from the ambiguities inherent in the application of conventional hypothesis testing procedure. The practical utility of MAICE in time series analysis is demonstrated with some numerical examples.

Sources