PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses
Why this mattered
PLINK mattered because it turned whole-genome association studies from a bespoke, fragile computational exercise into a reproducible analytic workflow. In 2007, GWAS datasets had suddenly grown to hundreds of thousands of SNPs across thousands of individuals, outpacing many existing genetics tools. Purcell and colleagues’ contribution was not a new disease locus, but an open-source C/C++ toolkit that made routine GWAS operations fast and accessible: data management, quality control, summary statistics, population-stratification checks, association testing, permutation procedures, and identity-by-descent estimation could be run on whole datasets rather than improvised marker by marker (paper).
The paradigm shift was infrastructural. PLINK helped standardize what counted as a basic GWAS analysis: filtering poorly genotyped markers, detecting sample relatedness or ancestry structure, testing case-control and quantitative-trait association, and producing outputs that could be shared, checked, and extended. This mattered because the first wave of large GWAS in the mid-to-late 2000s depended not only on genotyping arrays and HapMap-style reference data, but also on software that could make population-scale genotype data usable by ordinary genetics groups rather than only by highly specialized computational teams.
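To make two of the steps named above concrete, here is a minimal Python sketch of per-marker quality-control filtering followed by a basic case-control allelic test. It is an illustration under stated assumptions, not PLINK's implementation. The function names, the 0/1/2 genotype coding with `None` for missing calls, and the thresholds (`min_call_rate=0.95`, `min_maf=0.01`) are all hypothetical choices made for this example.

```python
import math


def call_rate(genotypes):
    """Fraction of non-missing calls; None marks a missing genotype.

    Assumes a non-empty list of genotypes coded 0/1/2 (copies of one allele).
    """
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Minor allele frequency computed from the non-missing genotypes."""
    called = [g for g in genotypes if g is not None]
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def passes_qc(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Keep a marker only if it is well genotyped and not too rare.

    Thresholds are illustrative assumptions, not recommended values.
    """
    if call_rate(genotypes) < min_call_rate:
        return False
    return minor_allele_frequency(genotypes) >= min_maf


def allelic_chi_square(case_genotypes, control_genotypes):
    """1-df chi-square test on the 2x2 table of allele counts by status.

    Returns (chi2, p_value).
    """
    def allele_counts(genotypes):
        called = [g for g in genotypes if g is not None]
        alt = sum(called)
        return alt, 2 * len(called) - alt

    a, b = allele_counts(case_genotypes)
    c, d = allele_counts(control_genotypes)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin (e.g. a monomorphic marker) carries no information.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    cases = [2, 2, 1, 1, 2, 1, 2, 2]
    controls = [0, 1, 0, 0, 1, 0, 0, 1]
    if passes_qc(cases + controls):
        print(allelic_chi_square(cases, controls))
```

In a real scan the same two steps would run once per marker across hundreds of thousands of SNPs, which is why the resulting p-values need multiple-testing correction.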
Its downstream importance is visible in how PLINK became a common substrate for human genetics: pre-imputation quality control, cohort harmonization, association scans, relatedness checks, and population-genetic summaries in studies of complex disease, psychiatric genetics, anthropological genetics, and biobank-scale research. Later breakthroughs such as large meta-analyses, polygenic risk scoring, rare-variant and sequencing pipelines, and second-generation tools like PLINK 2 built on the expectation that genome-wide data could be manipulated quickly, transparently, and at scale. In that sense, PLINK did for GWAS practice what a successful scientific instrument often does: it made a new kind of measurement routine enough that the field could ask larger questions.
Abstract
(no abstract available)
Related
- cite → Inference of Population Structure Using Multilocus Genotype Data — PLINK cites STRUCTURE as a genotype-based method for inferring population ancestry and detecting stratification.
- cite → Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing — PLINK cites Benjamini-Hochberg false discovery rate control as a multiple-testing correction option for genome-wide association results.
- cite → Principal components analysis corrects for stratification in genome-wide association studies — PLINK cites EIGENSTRAT/PCA correction as a method for controlling population stratification in genome-wide association studies.
- enables → The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans — PLINK enabled GTEx's genome-wide genotype quality control and association testing infrastructure for eQTL analysis.
- cite ← GCTA: A Tool for Genome-wide Complex Trait Analysis — GCTA complements PLINK by adding genome-wide mixed-model estimation of genetic variance from SNP data.
- cite ← The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans — The GTEx pilot uses PLINK-style genome association tooling for genotype quality control and variant analysis.
- enables ← Inference of Population Structure Using Multilocus Genotype Data — STRUCTURE's multilocus genotype framework for population stratification enabled PLINK's population-based association analysis tools and stratification-aware workflows.
- enables ← Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing — Benjamini-Hochberg false discovery rate control enabled PLINK to support genome-wide multiple-testing correction for large association scans.
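
The Benjamini-Hochberg correction referenced in the list above can be sketched in a few lines of Python. This is a generic version of the step-up procedure, not code taken from PLINK. The function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest), then a running
    minimum taken from the largest rank downward keeps the adjusted values
    monotone in the original p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values from four association tests.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A marker is then declared significant at false discovery rate q when its adjusted value is at most q.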
Sources
- DOI: https://doi.org/10.1086/519795
- OpenAlex: https://openalex.org/W2161633633