Microbes, the oldest and most numerous creatures on Earth, have a rich genomic history that offers clues to changes in the environment that have occurred over hundreds of millions of years.
While scientists are becoming increasingly aware of the many important environmental roles played by microbes living today--they process the food in our intestines, they keep carbon moving through the ocean food web, they can be harnessed to process sewage and build specific proteins--they still know little about these tiny critters, particularly marine microbes, which generally are classified into species based on their ecological niche. For instance, two species of marine microbe might look very similar physically, but one may have adapted to life in a particularly dark part of the ocean, while its sister species may have adapted to feeding off a nutrient that is rare in most parts of the ocean, but exists in abundance in one small area.
Scientists at MIT who are trying to understand existing microbes by studying their genetic history recently created a new approach to the study of microbial genomes that may hasten our collective understanding of microbial evolution.
The researchers have reversed the usual order of inquiry, which is to study an organism, then try to identify which proteins and genes are involved in a particular function. Instead, they have come up with a simple mathematical formula that makes it possible to analyze a gene family (a single type of gene or protein that exists in many creatures) simultaneously in a group of ecologically distinct species.
This means that researchers can begin to identify occurrences of natural selection in an organism's evolution simply by looking at its genome and comparing it with many others at once. It would also allow them to take advantage of the nearly 2,500 microbes whose genomes have already been sequenced.
The new method determines the "selective signature" of a gene, that is, the pattern of fast or slow evolution of that gene across a group of species, and uses that signature to infer gene function or to map changes to shifts in an organism's environment.
"By comparing across species, we looked for changes in genes that reflect natural selection and then asked, 'How does this gene relate to the ecology of the species it occurs in?'" said Eric Alm, the Doherty Assistant Professor of Ocean Utilization in the Departments of Civil and Environmental Engineering and Biological Engineering. Natural selection occurs when a random genetic mutation helps an organism survive and becomes fixed in the population. "The selective signature method also allows us to focus on a single species and better understand the selective pressures on it," said Alm.
"Our hope is that other researchers will take this tool and apply it to sets of related species with fully sequenced genomes to understand the genetic basis of that ecological divergence," said graduate student B. Jesse Shapiro, who coauthored with Alm a paper published in the February issue of PLoS Genetics.
Their work also suggests that evolution acts on functional modules--sets of genes that may not sit together on the genome, but that encode proteins that perform similar functions.
"When we see similar results across all the genes in a pathway, it suggests the genomic landscape may be organized into functional modules even at the level of natural selection," said Alm. "If that's true, it may be easier than expected to understand the complex evolutionary pressures on a cell."
For example, in Idiomarina loihiensis, a marine bacterium that has adapted to life near sulfurous hydrothermal vents on the ocean floor, the genes involved in metabolizing sugar and the amino acid phenylalanine underwent significant changes (over hundreds of millions of years) that may help the bacterium obtain carbon from amino acids rather than from sugars, a necessity for life in that ecological niche. In one of I. loihiensis' sister species, Colwellia psychrerythraea, some of those same genes have been lost altogether, an indication that sugar metabolism is no longer important for Colwellia.
Shapiro and Alm focused on 744 protein families among 30 species of gamma-proteobacteria that shared a common ancestor roughly one to two billion years ago. These bacteria include the laboratory model organism E. coli, as well as intracellular parasites of aphids, pathogens like the bacteria that cause cholera, and soil and plant bacteria. They mapped the evolutionary distance of each species from the ancestor and incorporated information about the gene family (for instance, important proteins evolve more slowly than less-vital ones) and the normal rate of evolution in a particular species' genome in order to determine a gene's selective signature.
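The article does not give the researchers' actual formula, so the following Python sketch is only an illustration of the general idea under stated assumptions. It assumes a complete table of evolutionary distances (one positive number per gene family per species) and a simple multiplicative model in which the expected distance is the gene family's average rate scaled by the species' average rate. The function name selective_signatures, the input layout, and the log-ratio score are all hypothetical choices made for this example, not the published method.

```python
import math


def selective_signatures(distances):
    """Return log2(observed / expected) for each gene family in each species.

    distances[family][species] is a positive evolutionary distance.
    ASSUMPTION: expected distance = family mean * species mean / grand mean,
    a simple multiplicative model chosen for illustration only.
    """
    families = list(distances)
    species = list(next(iter(distances.values())))

    # The sketch needs a complete table of positive values.
    for fam in families:
        if set(distances[fam]) != set(species):
            raise ValueError(f"family {fam!r} is missing some species")
        if any(distances[fam][sp] <= 0 for sp in species):
            raise ValueError(f"family {fam!r} has a non-positive distance")

    n_cells = len(families) * len(species)
    grand_mean = sum(distances[f][s] for f in families for s in species) / n_cells

    # Typical rate of each gene family (vital proteins tend to evolve slowly).
    family_mean = {
        f: sum(distances[f][s] for s in species) / len(species) for f in families
    }
    # Typical rate of each species' genome.
    species_mean = {
        s: sum(distances[f][s] for f in families) / len(families) for s in species
    }

    signatures = {}
    for f in families:
        signatures[f] = {}
        for s in species:
            expected = family_mean[f] * species_mean[s] / grand_mean
            # Positive: faster than expected; negative: slower than expected.
            signatures[f][s] = math.log2(distances[f][s] / expected)
    return signatures


if __name__ == "__main__":
    # Made-up numbers, purely to show the call shape.
    toy = {
        "familyA": {"species1": 0.1, "species2": 0.8},
        "familyB": {"species1": 0.3, "species2": 0.6},
    }
    for family, row in selective_signatures(toy).items():
        print(family, {sp: round(score, 3) for sp, score in row.items()})
```

In this sketch, a score near zero means a gene family is evolving in a species about as fast as its family-wide and genome-wide averages would predict, while a strongly positive or negative score flags unusually fast or slow evolution that might be worth relating to that species' ecology.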
"These are experiments we could never perform in a lab," said Alm. "But Mother Nature has put genes into an environment and run an evolutionary experiment over billions of years. What we're doing is mining that data to see if genes that perform a similar function, say motility, evolve at the same rate in different species. To the extent that they differ, it helps us to understand how change in core genes drives functional divergence between species across the tree of life."
This work is part of the Virtual Institute for Microbial Stress and Survival. The research was also supported by grants from the U.S. Department of Energy Genomics: GTL Program and the National Institutes of Health, and by a scholarship from the Natural Sciences and Engineering Research Council of Canada.
A version of this article appeared in MIT Tech Talk on April 2, 2008.