iSGTW - International Science Grid This Week

iSGTW 22 July 2009

Feature - Computing enables identification of microbe DNA in soil


Jonathan Eisen collects samples gathered by the Alvin submarine during a recent trip to the ocean floor.

Image courtesy of J. Eisen, UCD.

The traditional method for studying a microbe is to cultivate it in the lab and examine its biology in detail. However, lab cultivation is possible for only a small fraction of microbe species. Scientists have thus turned to metagenomics – the computation-reliant study of DNA extracted from environmental samples rather than from cultivated organisms.

In metagenomics, scientists grind up samples containing many different organisms and extract all the DNA they can, not knowing which pieces of DNA came from which organisms. A one-gram soil sample can contain up to several million species of microbes, all mixed together. The scientists sequence small, random fragments of the DNA to identify species and determine how they function, explained Jonathan Eisen, a University of California, Davis, researcher. Eisen heads the Genomic Encyclopedia of Bacteria and Archaea project of the Department of Energy’s Joint Genome Institute, which aims to catalogue genomic data for all major branches of microorganisms.

“Metagenomics is very much pushed by the available sequencing technology, and it totally depends on the algorithms and computing to make sense of the data,” said Folker Meyer, who runs Argonne National Laboratory’s MetaGenome Rapid Annotation using Subsystem Technology (MG-RAST) server, the primary data repository and analysis resource for the metagenomics community.

At Argonne National Laboratory, computational biologists Folker Meyer and Elizabeth Glass view charts of metagenomic data analysed using grid computing resources.

Image courtesy of ANL.

MG-RAST, which came online in 2008, is a free, fully automated online service for annotating the metagenome (the set of fragments of sequenced DNA) of an environmental sample. It currently has more than 1,500 users and houses more than 2,600 private and about 300 public metagenome datasets.

Researchers upload their sample’s metagenome, and MG-RAST uses a variety of computing resources – Argonne’s 800-core cluster, TeraGrid and cloud computing – to compare the DNA fragments to those from every other sample in the system, as well as to gene sequences in several other publicly available databases. Through its collaboration with the nonprofit Fellowship for the Interpretation of Genomes on the “Project to Annotate 1,000 Genomes,” the MG-RAST team also has access to a large collection of smaller, curated genome datasets. The software uses similarity to known genes to guide the reconstruction of the various species in the sample and to provide information on their functions.
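To make the idea of similarity-based matching concrete, here is a minimal sketch in Python. It is not MG-RAST's actual pipeline, which the article does not describe in detail. It swaps in a much simpler technique, comparing the sets of short subsequences (k-mers) shared between a fragment and each reference gene. The function names, the toy sequences and the 0.3 threshold are all hypothetical choices made for illustration.

```python
from typing import Dict, Optional, Set, Tuple


def kmers(seq: str, k: int = 4) -> Set[str]:
    """Return the set of all length-k substrings of a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: shared k-mers divided by all distinct k-mers."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(
    fragment: str,
    references: Dict[str, str],
    k: int = 4,
    min_score: float = 0.3,  # hypothetical cut-off, chosen for this toy example
) -> Tuple[Optional[str], float]:
    """Return the most similar reference gene, or None if nothing is close."""
    frag_kmers = kmers(fragment, k)
    best_name, best_score = None, 0.0
    for name, ref_seq in references.items():
        score = jaccard(frag_kmers, kmers(ref_seq, k))
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_score:
        # No known gene resembles the fragment closely enough.
        return None, best_score
    return best_name, best_score


# Toy reference "database" (made-up sequences, not real genes).
REFERENCES = {
    "gene_a": "ATGGCGTACGTTAGC",
    "gene_b": "TTTTCCCCGGGGAAAA",
}

if __name__ == "__main__":
    print(best_match("GCGTACGTTA", REFERENCES))  # fragment cut from gene_a
    print(best_match("ACACACACAC", REFERENCES))  # resembles neither reference
```

A fragment that resembles no reference comes back unassigned, which mirrors the difficulty of classifying organisms whose genomes are missing from the databases.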

The databases do not contain the genome of every species of microbe, which often makes it difficult to classify the organisms in a sample. “It is estimated there are at least 200 major groups of bacteria, and we (the public sector) only have genome data for about 10 of them,” said Eisen.

“Although there is still much work ahead, metagenomics provides a powerful new tool to help researchers better understand microbes they cannot grow in the lab,” Meyer said. “Metagenomics is more or less unleashing our ability to study the genomics of microbes from all sorts of environments across the planet.”

Amelia Williamson, for iSGTW
