GCTA: A Tool for Genome-wide Complex Trait Analysis
Why this mattered
Before GCTA, GWAS had made it possible to identify individual variants associated with complex traits, but it also sharpened the “missing heritability” problem: genome-wide significant SNPs usually explained only a small share of the heritability estimated from family or twin studies. Yang, Lee, Goddard, and Visscher changed the unit of analysis. Instead of asking which SNPs passed a genome-wide significance threshold, GCTA estimated how much phenotypic variance could be explained collectively by all genotyped SNPs, using genetic relationship matrices (GRMs) computed from SNP data and mixed linear models. This made the combined contribution of many common variants, each with an individually tiny effect, measurable in aggregate.
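To make the marker-based relatedness idea concrete, here is a minimal NumPy sketch of a genetic relationship matrix built from standardized genotypes. It is an illustration under stated assumptions, not GCTA's implementation: the function name `genetic_relationship_matrix`, the simulated genotypes, and the use of sample allele frequencies are all hypothetical choices made for this example, and the diagonal uses the same formula as the off-diagonal entries as a simplification.

```python
import numpy as np


def genetic_relationship_matrix(genotypes):
    """Return an (n x n) marker-based relationship matrix.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts.
    Hypothetical helper for illustration only.
    """
    G = np.asarray(genotypes, dtype=float)
    p = G.mean(axis=0) / 2.0            # sample allele frequency per SNP
    keep = (p > 0.0) & (p < 1.0)        # drop monomorphic SNPs (zero variance)
    G, p = G[:, keep], p[keep]
    # Standardize each SNP: center by 2p, scale by sqrt(2p(1-p)).
    Z = (G - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    # Average the cross-products over SNPs.
    return Z @ Z.T / Z.shape[1]


# Simulated data (assumption): 20 unrelated individuals, 500 independent SNPs.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.1, 0.9, size=500)
genotypes = rng.binomial(2, freqs, size=(20, 500))
grm = genetic_relationship_matrix(genotypes)
```

In a GREML-style analysis, a matrix like `grm` serves as the covariance structure of the random genetic effect in the mixed linear model, so the variance attributed to it estimates the share of phenotypic variance captured by all SNPs jointly. The variance-component fit itself is omitted here.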
The paradigm shift was methodological and conceptual: complex traits could be treated as highly polygenic even when few loci were individually detectable. GCTA gave researchers a practical tool for quantifying SNP heritability, partitioning genetic variance across chromosomes, estimating X-chromosome contributions, examining linkage disequilibrium structure, and simulating GWAS data. It helped move human genetics beyond cataloging “hits” toward estimating genetic architecture: how much signal exists, where it is distributed, and how discovery power scales with sample size.
Its influence is visible in later work on GREML analyses, SNP heritability estimation, genetic correlation, functional partitioning of heritability, and polygenic prediction. GCTA did not solve missing heritability by itself: it measured the fraction tagged by available SNPs under specific assumptions. But by showing that substantial heritable signal could be recovered from genome-wide marker similarity, it supplied a quantitative bridge between early GWAS and the large-scale biobank era, where polygenicity, variance partitioning, and prediction became central organizing ideas.
Abstract
(no abstract available)
Related
- cite → Principal components analysis corrects for stratification in genome-wide association studies — GCTA cites principal-components correction as a standard way to control population stratification in genome-wide association analyses.
- cite → PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses — GCTA complements PLINK by adding genome-wide mixed-model estimation of genetic variance from SNP data.
- enables → LD Score regression distinguishes confounding from polygenicity in genome-wide association studies — GCTA's genome-wide variance-component modeling of polygenic signal enabled LD Score regression's distinction between true polygenicity and confounding.
- cite ← LD Score regression distinguishes confounding from polygenicity in genome-wide association studies — LD Score regression contrasts its summary-statistic heritability and confounding estimates with GCTA's individual-level GREML variance-component approach.