
Picking up the long tail of scientific computations

An image of the long tail, as used in Chris Anderson's book.

The Long Tail: The term typically refers to the small sites that make up the bulk of the internet's content, but it can also refer to all the data produced by the smaller and more numerous research groups within science. The CHAIN Science Gateway may help these less-well-funded groups by giving them free access to global computing resources. Image courtesy Hay Kranen, Wikimedia.

During the European Grid Infrastructure Technical Forum in Prague, Czech Republic, Steve Tuecke, deputy director of the Computation Institute at Argonne National Laboratory, US, talked about a crisis in IT represented by the ‘dark data’ on the long tail of science. A large proportion of researchers are unable to take advantage of the scalable computing resources that better-funded research groups have access to. “We have exceptional computing infrastructure for the 1%, but not for 99%,” says Tuecke. The CHAIN Science Gateway may address this problem with a new and simple online registration process that enables any researcher, student, or citizen scientist to access a global computing grid spread across Eurasia, North America, South America, and Africa.

For the first time, CHAIN, which stands for Coordination and Harmonisation of Advanced e-Infrastructures, has developed an online portal that builds on the work of the Catania Science Gateway Framework to make grid computing accessible to anyone worldwide. The team has published a research paper about the gateway in the journal Studies in Health Technology and Informatics. The gateway is especially helpful for researchers who are less familiar with programming, since it simplifies the handling of grid security certificates according to the access needs of a user.

One grid to rule them all

An image of the global reach, including all the middleware and grid infrastructures, of the CHAIN Science Gateway.

CHAIN Science Gateway demonstration: Projects, middleware, and grid infrastructures associated with the CHAIN worldwide interoperability demo. Currently, there are about 20 computing sites spread across four continents. At the IEEE Cluster 2012 event, which started on Monday 24 September in Beijing, the CHAIN project team held a meeting to discuss how to turn the demo into a real service. Image courtesy Roberto Barbera.

Developers of the science gateway used the Simple API for Grid Applications (SAGA) standard and its JSAGA implementation, developed at the IN2P3 computing center in Lyon. Through adapters, JSAGA links up the different middleware of the individual e-Infrastructures connected to the network. User grid certificates are replaced by robot certificates. An API, the Catania Grid Engine, has been built on top of JSAGA to interact directly with scientific applications run on the grid, independent of the middleware, which makes it easier to add new scientific applications.
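The layering described above can be illustrated with a minimal sketch. It is not the JSAGA or Catania Grid Engine API: every class, method, and string below (`MiddlewareAdapter`, `StubAdapter`, `GridEngine`, `"robot-cert"`) is a hypothetical name chosen for illustration. The sketch only assumes what the text states: one adapter per middleware, a shared robot credential in place of personal certificates, and an engine that lets an application run without knowing which middleware sits underneath.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """A scientific application to run, described independently of any middleware."""
    application: str
    arguments: tuple = ()


class MiddlewareAdapter(ABC):
    """Hypothetical adapter interface: one implementation per grid middleware."""

    @abstractmethod
    def submit(self, job: Job, credential: str) -> str:
        """Submit the job and return a middleware-specific job identifier."""


class StubAdapter(MiddlewareAdapter):
    """Stand-in for a real adapter; it only builds an identifier string."""

    def __init__(self, middleware_name: str):
        self.middleware_name = middleware_name

    def submit(self, job: Job, credential: str) -> str:
        return f"{self.middleware_name}:{job.application}:{credential}"


class GridEngine:
    """Hypothetical engine layer: hides adapters and credentials from the caller."""

    def __init__(self, robot_credential: str):
        # A single robot credential is used instead of a per-user certificate.
        self._robot_credential = robot_credential
        self._adapters = {}

    def register(self, infrastructure: str, adapter: MiddlewareAdapter) -> None:
        """Attach an infrastructure by naming the adapter that speaks its middleware."""
        self._adapters[infrastructure] = adapter

    def run(self, infrastructure: str, job: Job) -> str:
        """Dispatch the job through whichever adapter serves the infrastructure."""
        adapter = self._adapters[infrastructure]
        return adapter.submit(job, self._robot_credential)


engine = GridEngine(robot_credential="robot-cert")
engine.register("site-a", StubAdapter("middleware-a"))
engine.register("site-b", StubAdapter("middleware-b"))
job_id = engine.run("site-b", Job("GROMACS"))
```

In this sketch, adding a new infrastructure only requires a `register` call with a suitable adapter, and the code that submits `Job("GROMACS")` stays the same whichever infrastructure is chosen.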

This reduces the barrier to entry for researchers put off by the term ‘grid computing.’ “It may have been a mistake to describe distributed computing as grids, clusters, and clouds,” said Roberto Barbera, from the University of Catania, at the Italian National Institute of Nuclear Physics, and the technical coordinator of the CHAIN project. “By accessing the grid in this way, a user doesn’t need to know what infrastructure they’re using. The only limit for the user is for them to decide what type of applications they want to run.”

The CHAIN Science Gateway provides access to tens of thousands of cores and seamlessly connects eight grids, including the European Grid Infrastructure, RedCLARA in South America, CNGrid in China, and even a grid infrastructure in Syria.

If a new e-infrastructure wants to join, it can either use an existing JSAGA adapter, or CHAIN’s developers can work with the new e-infrastructure’s technicians to set up JSAGA. For example, in the case of OurGrid, JSAGA adapters were built from scratch, said Barbera.

A global computing grid for everyone?

An image of the CHAIN Science Gateway Facebook application.

CHAIN Facebook app: You can even access the global grid directly through CHAIN's Facebook page. On the page you can click straight through to the app. If you are authorized with your social media credentials, you can use single sign-on directly through the portal and start running applications. Image courtesy Roberto Barbera.

To join this global grid, an existing user only has to enter their username and password, and a new user only has to create them. Their profile then becomes a single federated identity, which automatically transfers to any e-infrastructure they wish to access on the network.

Currently, a number of research communities are running applications, such as ASTRA for the arts and humanities and GROMACS for the biomedical community. Life sciences have benefited the most from the CHAIN Science Gateway, according to Rafael Mayo of CIEMAT, Spain, who spoke during a presentation at the EGI Technical Forum.

Alexandre Bonvin is a computational structural biologist at Utrecht University who is coordinating the WeNMR project, which is part of the CHAIN network, and performs computations on the WeNMR e-Infrastructure. Bonvin says CHAIN represents not just a technical bridge between infrastructures, but also a way to build bridges with new users trying to connect to his research group. “We have been able to talk with research groups and institutes in Latin America, for example. By having direct contacts and knowing who the local grid certificate authorities are, we can better guide new users to access our computing resources, lowering the barrier to the use of our e-Infrastructure,” says Bonvin.

We could be seeing the opening of computing grids to citizen scientists and interested members of the public. Just as small websites make up the bulk of the internet's content, known commonly as the long tail, the CHAIN Science Gateway may enable smaller research groups or individuals to perform more ambitious scientific computations, picking up the long tail of scientific data.

“We’re now planning to build a software layer to incorporate federated cloud computing to add more computing power to the network. We also have Google Android and Apple iOS APIs, which means people with mobile phones can upload and download data from the global e-infrastructure. This is important for continents where the majority of people access the web via mobile appliances,” said Barbera.

But work still needs to be done to turn the raw data output of the grid into meaningful results that a researcher can interpret in the form of tables, graphs, or visualizations. This is the next challenge, and the CHAIN-REDS project will focus on it from 1 December 2012.
