Identification of common molecular subsequences

Why this mattered

Smith and Waterman’s 1981 paper made local sequence alignment a mathematically rigorous optimization problem. Earlier dynamic programming methods, notably Needleman and Wunsch's 1970 global alignment, compared sequences end to end. This work gave a dynamic programming algorithm that finds the pair of segments, one from each of two molecular sequences, with the highest similarity score under an explicit scoring scheme for matches, mismatches, and gaps. The key shift was from asking whether two full sequences could be globally aligned to asking where, within longer sequences, the best-conserved regions occur.
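The dynamic programming idea can be illustrated with a minimal Python sketch. It is not the paper's exact formulation: the paper allows a general gap weight that depends on gap length, while this sketch assumes a fixed penalty per gapped position. The function name `smith_waterman` and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are illustrative assumptions. Each cell holds the best score of any local alignment ending at that pair of positions, and clamping at zero lets an alignment start anywhere.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Sketch of local alignment with a linear gap penalty (an assumption).

    Returns (best_score, aligned_a, aligned_b) for one highest-scoring
    local alignment of strings a and b.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a fresh alignment here
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("CCCCGATTACACCCC", "TTTGATTACATTT"))
```

With these assumed scores, the shared segment `GATTACA` is recovered from two otherwise unrelated strings. The sketch uses time and memory proportional to the product of the two sequence lengths, which is the cost that later heuristic tools aimed to reduce.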

That mattered because biology often preserves domains, motifs, active sites, and exons rather than entire molecules end to end. The Smith-Waterman algorithm made it possible to find the optimal local alignment under a given scoring scheme rather than an approximation of it, providing a foundation for inferring homology, function, and evolutionary relationships from DNA, RNA, and protein sequences. As sequence databases grew, this exact formulation became the benchmark against which faster heuristic tools were judged.

Its influence runs through later computational biology: BLAST and FASTA traded exact optimality for speed, but their central task was shaped by the local-alignment problem Smith and Waterman formalized. The paper helped turn molecular sequence comparison into a core quantitative method of genomics, enabling database search, annotation of newly sequenced genes, comparative genomics, and the interpretation of conserved functional elements across species.

Sources