Mining association rules between sets of items in large databases

Why this mattered

TBD

Abstract

We are given a large database of customer transactions. Each transaction consists of items purchased by a customer in a visit. We present an efficient algorithm that generates all significant association rules between items in the database. The algorithm incorporates buffer management and novel estimation and pruning techniques. We also present results of applying this algorithm to sales data obtained from a large retailing company, which shows the effectiveness of the algorithm.
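To make the notion of an association rule between items concrete, here is a minimal Python sketch. It is not the algorithm presented in the paper: it uses brute-force enumeration and none of the buffer management, estimation, or pruning techniques the abstract mentions. It assumes the standard definitions of support (the fraction of transactions containing an itemset) and confidence (the fraction of transactions containing the antecedent that also contain the consequent), and it restricts rules to a single-item consequent. The toy transactions, the thresholds, and the names `count`, `mine_rules`, `min_support`, and `min_confidence` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions (illustrative only, not data from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter", "eggs"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def mine_rules(transactions, min_support, min_confidence):
    """Brute-force search for rules X -> y with a single-item consequent y.

    Returns tuples (antecedent, consequent, support, confidence).
    Assumption: a rule is kept when both thresholds are met.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    # Itemsets of size >= 2 so that the antecedent is never empty.
    for size in range(2, len(items) + 1):
        for itemset in combinations(items, size):
            itemset_count = count(itemset, transactions)
            support = itemset_count / n
            if support < min_support:
                continue
            for consequent in itemset:
                antecedent = tuple(i for i in itemset if i != consequent)
                confidence = itemset_count / count(antecedent, transactions)
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, support, confidence))
    return rules


rules = mine_rules(transactions, min_support=0.5, min_confidence=0.7)
for antecedent, consequent, support, confidence in rules:
    print(f"{set(antecedent)} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

This enumeration examines every candidate itemset, so its cost grows exponentially with the number of distinct items. That makes it usable only on tiny examples, and it is the cost that an efficient algorithm for large databases has to avoid.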

  • cite: Classification and Regression Trees. Association-rule mining contrasts with CART-style decision-tree classification as a different method for discovering structure in large datasets.
  • cite: Induction of Decision Trees. Association-rule mining cites decision-tree induction as prior work on rule-like knowledge discovery from structured data.
  • cite: Classification and Regression Trees. Association-rule mining relates to CART because both extract predictive or descriptive rules from tabular item-attribute data.
  • enables: Classification and Regression Trees. CART established tree-based rule induction over tabular data, a precursor to association-rule mining's search for predictive itemset rules in transaction databases.
  • enables: Induction of Decision Trees. ID3 showed how to induce interpretable decision rules from data, which association-rule mining generalized to discovering frequent co-occurrence rules among itemsets.
  • enables: Classification and Regression Trees. CART's recursive partitioning of discrete attributes helped frame market-basket data as rule-discoverable itemset splits.

Sources