Statistical significance for genomewide studies

Why this mattered

Before genomewide assays, statistical significance was largely framed around guarding each individual claim against being a false positive. Storey and Tibshirani helped shift the unit of evidence from the single test to the discovery set: in experiments with thousands of genes, markers, or features, the practical question was not “can we make every false positive unlikely?” but “among the findings we call significant, what fraction should we expect to be false?” Their q-value gave researchers a per-feature quantity analogous to the p-value but tied to the false discovery rate, so large-scale biological screens could be interpreted without the loss of power that overly conservative thresholds impose.

That shift made genomewide discovery statistically routine. It allowed investigators to report ranked lists of genes, loci, transcripts, or molecular features with an explicit expected error rate, preserving power while acknowledging that some false positives are inevitable in high-throughput biology. This was especially important for microarray studies, early genome scans, and later omics workflows, where the goal was often to nominate candidates for follow-up rather than to establish a single definitive association.

The paper mattered because it helped normalize a discovery-oriented statistical culture that subsequent genomics depended on: differential expression analysis, eQTL mapping, genome-wide association studies, proteomics, methylation studies, and other high-dimensional screens all needed principled ways to separate signal-rich candidate sets from noise. Building on the false discovery rate framework introduced by Benjamini and Hochberg, Storey and Tibshirani’s PNAS paper made q-values a practical language for genome-scale evidence.

Abstract

With the increase in genomewide experiments and the sequencing of multiple genomes, the analysis of large data sets has become commonplace in biology. It is often the case that thousands of features in a genomewide data set are tested against some null hypothesis, where a number of features are expected to be significant. Here we propose an approach to measuring statistical significance in these genomewide studies based on the concept of the false discovery rate. This approach offers a sensible balance between the number of true and false positives that is automatically calibrated and easily interpreted. In doing so, a measure of statistical significance called the q value is associated with each tested feature. The q value is similar to the well known p value, except it is a measure of significance in terms of the false discovery rate rather than the false positive rate. Our approach avoids a flood of false positive results, while offering a more liberal criterion than what has been used in genome scans for linkage.
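
To make the q value concrete, here is a minimal Python sketch of how one might be attached to each tested feature. It rests on stated assumptions and is not the authors' implementation: the function names `estimate_pi0` and `qvalues` are hypothetical, the p-values are invented for illustration, and the proportion of true null features is estimated at a single fixed tuning parameter `lam = 0.5`, whereas the paper selects this estimate by smoothing over a range of tuning values.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Estimate the proportion of truly null features.

    Assumption: a single fixed tuning parameter `lam` is used here.
    P-values above `lam` are treated as coming mostly from null features,
    which are uniformly distributed on [0, 1].
    """
    m = len(pvalues)
    tail = sum(1 for p in pvalues if p > lam)
    # Cap at 1.0 because a proportion cannot exceed one.
    return min(1.0, tail / (m * (1.0 - lam)))


def qvalues(pvalues, lam=0.5):
    """Return one q value per p-value, in the original input order."""
    m = len(pvalues)
    if m == 0:
        return []
    pi0 = estimate_pi0(pvalues, lam)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that each
    # q value is the minimum estimated FDR over all thresholds that
    # would still call this feature significant.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        fdr_at_threshold = pi0 * m * pvalues[idx] / rank
        running_min = min(running_min, fdr_at_threshold)
        q[idx] = running_min
    return q


if __name__ == "__main__":
    # Hypothetical p-values for eight features (not data from the paper).
    example = [0.04, 0.9, 0.001, 0.6, 0.02, 0.3, 0.01, 0.8]
    for p, q in zip(example, qvalues(example)):
        print(f"p = {p:<6} q = {q:.3f}")
```

Calling features significant when their q value is at most 0.05 then targets an expected false discovery rate of about 5% among the features called, rather than a 5% false positive rate for each test taken separately. With the eight hypothetical p-values above, three features meet that cutoff. Setting the estimated null proportion to 1 in this sketch recovers the more conservative Benjamini and Hochberg adjustment.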

Sources