In this special feature iSGTW chats to Les Robertson, who recently stepped down after six years at the head of the Large Hadron Collider Computing Grid.
In the beginning
Les Robertson arrived at CERN in 1974 to fix a problem. The European physics research laboratory had just purchased a new supercomputer. The problem, says Robertson, was that it didn’t work.
“At that time customers fixed their own operating systems,” he explains. “I arrived as an operating systems expert and stayed on.”
Twenty-seven years later, Robertson began work on an entirely different problem: preparations for the Large Hadron Collider were well underway, but the computing resources required to handle LHC data had been left behind.
“Computing wasn’t included in the original costs of the LHC,” Robertson explains. “The story was that you wouldn’t be able to estimate the costs involved, although the estimates we made at the time have proven to be more or less correct.” This decision left a big hole in the funding for IT infrastructure crucial to the ultimate success of the LHC.
“We clearly required computing,” says Robertson, “but the original idea was that it could be handled by other people.”
In 2001, these “other people” had not stepped forward.
“There was no funding at CERN or elsewhere,” Robertson says. “A single organization could never find the money to do it. We realized the system would have to be distributed.”
CERN began asking countries to help. The charge was led by the UK, which contributed a big chunk of e-science funding, closely followed by Italy, which continues to supply substantial funding to CERN. Germany also donated a chunk of funding, and then, says Robertson, other countries followed suit.
“This money gave us a big boost,” he says. “It allowed us to create something much bigger.”
In 1999 Harvey Newman of Caltech had initiated the MONARC project to look at distributed architectures that could integrate computing resources for the LHC, no matter where they were located. At around the same time, Carl Kesselman and Ian Foster carved out a spot on the world stage for the Grid.
“Their book motivated the idea of doing distributed computing in a very general way,” Robertson says. “It stimulated everyone’s interest. We decided to ride the wave.” But the Grid has not become a panacea, says Robertson. “It has become 250 different things, which has led to both benefits and problems. Standards haven’t emerged in the way we expected, nor have off-the-shelf products.”
A big success of the LCG has been the involvement of multiple centers from around the world.
“Different countries, universities, labs… We have over 110 Tier-2 centers up and running, some big and some very small, but all delivering resources to the experiments,” Robertson explains. “Many of these are computing centers that haven’t been a fundamental part of the experiments’ environment before, and we’ve all put a lot of effort into working as a collaboration, sensitizing people to what will be required when the first data starts to arrive. The advantage is that all these centers are now involved in the experiments, and so there are many options for injecting new resources when they are required.”
When asked about the challenges he faced as head of the LCG project, Robertson laughs wryly. “There were several big problems,” he says, “and they were all a bit the same.”
- Immediate: stabilizing operations
- Mid-term: managing the data
- Long-term: managing energy requirements
So is Robertson confident that all will go according to the LCG plan when the first proton beams race through the LHC? He’s hoping!
“There is a lot of work still to be done,” he says. “This is new, this idea that you start a machine and the computing required is not all in the same place as the machine. It hasn’t actually been done before. When the beams come, we don’t know what will happen. Things will be chaotic; people will want things we didn’t expect. But HEP is showing that this highly distributed environment is usable. Physicists are no longer dependent on CERN having all the funding or CERN deciding on priorities. We’ve created a democratic environment where you can plug in computing resources wherever you find them. In principle, that was the real goal of the Grid.”
- Cristy Burne, iSGTW