Initial sequencing and analysis of the human genome
Why this mattered
This paper mattered because it turned the human genome from an object of aspiration into a public reference infrastructure. The International Human Genome Sequencing Consortium’s draft sequence, published in Nature in 2001, gave biology a shared coordinate system for locating genes, variants, repeats, conserved regions, and disease-associated loci across the genome. Its significance was not simply that “the genome was sequenced,” but that the sequence was made freely available and paired with an initial analysis showing that human biology could now be studied at genome scale rather than gene by gene.
The results also changed expectations about what made humans biologically complex. The paper’s estimate of roughly 30,000 to 40,000 protein-coding genes, far fewer than many had expected, together with its analysis of repetitive elements, segmental duplications, chromosome structure, and evolutionary traces, helped shift attention from simple gene counts toward regulation, genome architecture, variation, and comparative genomics. After its publication, questions about development, disease, and evolution could be framed against a nearly genome-wide map rather than isolated molecular fragments.
Many later breakthroughs depended on this reference: dense SNP maps, the HapMap, genome-wide association studies, cancer genome sequencing, clinical variant interpretation, ENCODE-style functional annotation, and eventually routine whole-genome and exome sequencing. The 2001 draft was incomplete and later refined, but it established the paradigm that biology and medicine would increasingly be data-rich, reference-based, and computationally comparative.
Abstract
The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
Related
- cite → Identification of common molecular subsequences — The Human Genome paper cites the common molecular subsequence algorithm as a foundation for sequence alignment used in genome assembly and comparison.
- cite → The Sequence of the Human Genome — Both papers report draft human genome sequences, linking the public consortium assembly to Celera's parallel whole-genome shotgun sequence.
- enables ← Identification of common molecular subsequences — Common-subsequence dynamic programming enabled sequence alignment methods used to assemble and annotate the human genome.
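The local-alignment idea behind "Identification of common molecular subsequences" (the Smith-Waterman algorithm) can be illustrated with a minimal sketch. It fills a dynamic-programming table whose cells are floored at zero, then traces back from the highest-scoring cell to recover the best-matching pair of subsequences. The function name `smith_waterman`, the scoring values (`match=2`, `mismatch=-1`, `gap=-1`), and the simple linear gap penalty are assumptions chosen for illustration; this is not the software the consortium used, and production aligners rely on substitution matrices, affine gaps, and heuristics for speed.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score of any local alignment ending at a[i-1], b[j-1]
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            # Flooring at zero is what makes the alignment local.
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-scoring cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Two toy sequences that share the subsequence "ACGT".
    print(smith_waterman("TTTACGTAAA", "GGACGTCC"))
```

The table takes time and memory proportional to the product of the two sequence lengths, which is why exhaustive dynamic programming of this kind is typically reserved for refining candidate matches rather than scanning whole genomes directly.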
Sources
- DOI: https://doi.org/10.1038/35057062
- OpenAlex: https://openalex.org/W2168909179