Lessons learnt from g-INFO

Grids, clusters and supercomputers are often a foreign concept to those not involved in the HPC world. One of the challenges developers face is making grids easy to use for non-grid users. And, as Vincent Breton found out when developing a system to monitor bird flu, it’s never an easy one to solve.

When bird flu is detected, it is often in poultry. Tracking the genome in chickens helps researchers figure out if the virus is going to become a threat. Image courtesy USAID.

A bird flu pandemic, caused by the H5N1 virus strain, is a constantly looming threat for countries in Southeast Asia. Since H5N1 was first isolated in 1996, a number of outbreaks have occurred throughout the region. Today the virus is endemic in birds but, fortunately, it has not yet evolved into a strain that can transmit easily between humans.

A key weapon against the spread of H5N1 is tracking the evolution of the virus’ genome whenever a new outbreak occurs. In this way virologists can monitor whether the virus is becoming more or less of a threat, helping to inform global health decisions and prevent a human pandemic. But researchers in countries such as Vietnam, where bird flu was first reported in 2004, do not always have access to computing resources for this.

To solve this problem, Vincent Breton and his team developed the grid-based International Network for Flu Observation (g-INFO), which is based in Vietnam. Breton leads the French NGI and works in Aubière at the French National Institute of Nuclear and Particle Physics (IN2P3).

“The original aim of g-INFO was to give researchers front line access to computing resources,” said Breton. In Vietnam, when a suspected case of bird flu strikes, a sample is sent to the Institute of Biotechnology in Hanoi for the virus to be isolated and its sequence analysed. Researchers then use this information to decide whether the strain is new or old and to monitor its evolution.

But what’s needed to do that? “The researcher analyzed a genome which he wants to compare, so he needs access to a data collection,” said Breton. The researcher also needs to process the data in batch mode and to access a set of computing resources, whether a grid or a cluster. g-INFO brings these three things together, providing a way for non-grid users to monitor flu on the grid. It pulls the latest genetic sequences of the H5N1 virus from a number of databases, giving researchers a global viewpoint.
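To make the comparison step concrete, here is a minimal Python sketch of matching a newly sequenced isolate against a reference collection. It is an illustration only, not g-INFO's actual pipeline: the function names, the reference labels and the short sequence fragments are all invented for the example, and the sequences are assumed to be already aligned to the same length. A real analysis would run a proper alignment and phylogenetic tree-building tool on full genomes.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_reference(isolate: str, references: dict[str, str]) -> tuple[str, float]:
    """Return the name of, and distance to, the reference most similar to the isolate."""
    return min(
        ((name, p_distance(isolate, seq)) for name, seq in references.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical, made-up fragments standing in for sequences pulled from databases.
references = {
    "ref_A": "ATGGAGAAAATAGTGT",
    "ref_B": "ATGGAGAGAATAGTAC",
    "ref_C": "ATGCAGAAGATCGTGC",
}
new_isolate = "ATGGAGAGAATAGTGC"

name, dist = closest_reference(new_isolate, references)
print(f"closest reference: {name} (p-distance {dist:.4f})")
```

Pairwise distances like these are the raw material for distance-based tree building. Computing them for every pair in a large collection is the kind of batch workload that benefits from a grid or a cluster.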

“The system is working and operational,” said Breton. “We can extract sequences from databases and produce phylogenetic trees.”

Sounds great, right? But here comes the clincher: although the system is up and running, it is not being used by the virologists.

Why do some grid applications fail to take off?

There are three common challenges in getting non-grid users to use the grid: policies governing the use of grid resources, security, and the need for a customized user interface, according to Doan Trung Tung, who helped develop the g-INFO system at the Institut de la Francophonie pour l'Informatique in Vietnam and the Université Blaise Pascal in France.

In the case of g-INFO, the first grid application to be fully developed in Vietnam, a number of other factors were also in the mix. Close cooperation between developers and virologists was hampered by time and location constraints.

“Another problem is that the grid connection in Vietnam is not stable,” said Tung. “In fact our grid nodes in Vietnam have lost the connection with EGI since November 2010.”

So what lessons can be learnt from the challenges g-INFO has encountered?

“It teaches us about humility, patience and perseverance,” said Breton. “Public health is an obvious field of application for grid technology but it takes more than a good idea and hard programming work for the grid to make an impact. It takes a deeper understanding of the requirements and time to customize the services to the real needs of the targeted users who have heavy responsibilities and limited time.”

For now, the g-INFO developers are using the platform as a prototype and a demonstrator for public health research. “At first, g-INFO [was developed to] provide biologists with an integrated database and bioinformatic tools for monitoring Influenza A molecular evolution. But as g-INFO has been designed with general concepts, it can be applied to other problems, not only Influenza A,” said Tung.

The platform will be adapted to search for new drugs derived from Vietnam's biodiversity. The developers hope that, by keeping in close contact with the chemists working on the project, they can create a truly usable platform and avoid the mistakes they may have made in the past.
