Classification and Regression Trees.¶

Why this mattered¶

Classification and Regression Trees made recursive partitioning a general-purpose statistical method rather than a collection of ad hoc decision rules. Its central shift was to treat prediction as the construction of a data-adaptive tree: split the feature space into regions, assign simple predictions within each region, and use systematic criteria for splitting, pruning, and validation. That made nonlinear interactions, threshold effects, and heterogeneous subpopulations practical to model without specifying a parametric form in advance. In the authors’ framing, this was also a computational turn: the method depended on search, resampling, and optimization procedures that were not natural extensions of hand calculation.
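The split-selection step described above can be illustrated with a minimal sketch. It assumes a single numeric feature, class labels, and Gini impurity as the splitting criterion; the function names `gini` and `best_split` and the toy data are illustrative choices, not taken from the monograph. A full tree would apply the same search recursively to each resulting region.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(xs, ys):
    """Search thresholds t on one numeric feature for the split x <= t
    that minimizes the size-weighted impurity of the two child nodes.
    Returns (threshold, weighted_impurity); threshold is None if no
    split improves on the unsplit node."""
    n = len(xs)
    best = (None, gini(ys))  # baseline: impurity of the parent node
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Toy data (made up for illustration): two well-separated classes.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = ["a", "a", "a", "b", "b", "b"]
threshold, impurity = best_split(xs, ys)
print(threshold, impurity)
```

On this toy data the search settles on the midpoint between 3.0 and 10.0, where both child nodes are pure.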

What became newly possible was an interpretable predictive procedure that could handle classification and regression, mixed variable types, missing values, and complex decision boundaries within one framework. The monograph’s lasting importance lies less in any single tree than in the statistical discipline it imposed on tree construction: impurity measures, cost-complexity pruning, and honest assessment of predictive performance made trees usable as data analysis tools rather than merely descriptive diagrams.
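Cost-complexity pruning can be sketched in a few lines. The criterion penalizes a subtree's error R(T) by its number of leaves, R_alpha(T) = R(T) + alpha * |leaves(T)|, and selects the subtree that minimizes the penalized value. The subtree names and error figures below are hypothetical numbers chosen only to show how the selected tree shrinks as alpha grows; in practice alpha would be chosen by cross-validation or a held-out test sample.

```python
def cost_complexity(error, n_leaves, alpha):
    """Penalized error R_alpha(T) = R(T) + alpha * (number of leaves)."""
    return error + alpha * n_leaves


# Hypothetical nested subtrees: (name, training error R(T), number of leaves).
# These numbers are invented for illustration only.
subtrees = [
    ("full", 0.10, 8),
    ("pruned", 0.14, 4),
    ("stump", 0.25, 2),
    ("root", 0.40, 1),
]


def select(subtrees, alpha):
    """Return the name of the subtree with the smallest penalized error."""
    return min(subtrees, key=lambda t: cost_complexity(t[1], t[2], alpha))[0]


for alpha in (0.0, 0.02, 0.2):
    print(alpha, select(subtrees, alpha))
```

With alpha at zero the largest tree wins on training error alone; larger penalties favor progressively smaller subtrees.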

CART also supplied the conceptual substrate for several later breakthroughs in machine learning. Bagging and random forests extended trees by reducing their instability through aggregation; boosting turned sequences of weak trees into highly accurate predictors; gradient-boosted tree systems later became dominant tools for structured data. In that sense, CART helped define a new paradigm: flexible, algorithmic, computer-intensive statistical learning methods whose power came from adaptive structure rather than closed-form models.

Abstract¶

The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.

Related¶

enables → Regression Shrinkage and Selection Via the Lasso — CART popularized prediction via model selection and regularization tradeoffs, which lasso addressed for linear regression through L1 shrinkage.
enables → Extended-Connectivity Fingerprints — CART popularized tree-based decision rules, the kind of interpretable branching structure later used to reason about molecular substructures encoded by extended-connectivity fingerprints.
enables → Bagging Predictors — CART supplied the high-variance decision-tree learners that bagging stabilized through bootstrap aggregation.
enables → Mining association rules between sets of items in large databases — CART established tree-based rule induction over tabular data, a precursor to association-rule mining's search for predictive itemset rules in transaction databases.
cite ← Regression Shrinkage and Selection Via the Lasso — The lasso contrasts its continuous shrinkage-and-selection procedure with CART's tree-based variable selection and prediction framework.
cite ← Extended-Connectivity Fingerprints — Extended-connectivity fingerprints cites CART as a decision-tree learning method used for molecular classification and regression tasks.
cite ← Bagging Predictors — Bagging Predictors uses classification and regression trees as unstable base learners whose variance can be reduced by bootstrap aggregation.
cite ← Mining association rules between sets of items in large databases — Association-rule mining contrasts with CART-style decision-tree classification as a different method for discovering structure in large datasets.

Sources¶

DOI: https://doi.org/10.2307/2530946
OpenAlex: https://openalex.org/W1594031697
