A screenshot of the Einstein@Home screensaver. Image courtesy of Einstein@Home.
For more than five years, volunteers have been lending their computers’ spare cycles to the Laser Interferometer Gravitational-Wave Observatory (LIGO) and GEO-600 projects via the BOINC application Einstein@Home. Now a new application wrapper, dubbed “Einstein@OSG,” brings the application to the Open Science Grid.
Today, although Einstein@OSG has been running for only six months, it is already the top contributor to Einstein@Home, processing about 10 percent of jobs.
“The Grid was perfectly suitable to run an application of this type,” said Robert Engel, lead developer and production coordinator for the Einstein@OSG project. “BOINC would benefit from every single CPU that we would provide for it. Increasing the number of CPUs by 1000 really results in 1000 times more science getting done.”
Getting Einstein@Home to run on a grid was not without difficulties. Normally, a volunteer downloads and installs the application, which then continuously downloads data, analyzes it, and returns the results. In short, each instance of Einstein@Home has a permanent home on a volunteer’s computer.
The same process would not work on the Grid. Grid jobs cannot run indefinitely, so each instance of Einstein@OSG was given a time limit.
“Once the time limit is up, the Einstein@Home application exits, followed by the Einstein@OSG application, which will save all results to an external storage location,” Engel explained. “The next time Einstein@OSG starts, it likely starts on a different cluster node which may use a different architecture.”
Each time it starts, the Einstein@OSG application detects changes in its environment, such as the architecture, location, software versions, or network connectivity, and then compiles any missing software on the fly. After a final check to verify that all requirements for Einstein@Home are met, it launches Einstein@Home. The results from the previous run are loaded from the remote storage location, and Einstein@Home picks up where it left off.
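The cycle described above (run until a time limit, save the results externally, resume on whatever node comes next) can be illustrated with a minimal Python sketch. This is not the Einstein@OSG code. The function name `run_with_time_limit`, the JSON checkpoint file standing in for the external storage location, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import json
import os
import tempfile
import time


def run_with_time_limit(work_items, checkpoint_path, time_limit_s, process,
                        clock=time.monotonic):
    """Process work items until finished or the time limit is reached.

    Progress is read from and written back to `checkpoint_path`, a
    hypothetical stand-in for the external storage location.
    Returns True once every item has been processed.
    """
    # Resume from the previous run's results, if a checkpoint exists.
    state = {"next_index": 0, "results": []}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)

    # Work until the items run out or the deadline passes.
    deadline = clock() + time_limit_s
    while state["next_index"] < len(work_items) and clock() < deadline:
        item = work_items[state["next_index"]]
        state["results"].append(process(item))
        state["next_index"] += 1

    # Write to a temporary file and rename it, so that an interrupted
    # save never leaves a half-written checkpoint behind.
    directory = os.path.dirname(os.path.abspath(checkpoint_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, checkpoint_path)

    return state["next_index"] >= len(work_items)
```

Calling the function repeatedly with the same checkpoint path, possibly from different machines that share the storage, continues the work from wherever the previous call stopped.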
An application on a grid will encounter software and hardware issues much more frequently than a desktop application such as Einstein@Home, according to Engel. This is because grids are much more complex and handle an extremely high volume of jobs.
Because the average Einstein@Home user will encounter an error only every couple of months, it’s practical for her to handle the error manually. With Einstein@OSG running on up to 10,000 cores, however, there are errors every couple of minutes. Fixing these manually simply isn’t practical, so the Einstein@OSG team eventually automated error detection and recovery.
“It was only because of that mechanism that we were able to scale up,” Engel said. “A computer never gets tired looking for errors and fixing them, unlike me, who likes to sleep at night and spend time with his family.”
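The kind of automated error handling Engel describes might look roughly like the following Python sketch. It is an illustration only, not Einstein@OSG's actual mechanism: `run_with_recovery`, the handler table, and the retry limit are hypothetical names and choices.

```python
def run_with_recovery(job, handlers, max_attempts=3):
    """Run job(); when a known error occurs, apply its fix and retry.

    `handlers` maps an exception class to a function that attempts a
    repair. Matching is by exact class, so subclasses need their own
    entries. Returns (result, names_of_errors_recovered_from).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    recovered = []
    for attempt in range(1, max_attempts + 1):
        try:
            return job(), recovered
        except Exception as exc:
            handler = handlers.get(type(exc))
            # Unknown errors, or errors on the final attempt, are
            # passed up for a human to look at.
            if handler is None or attempt == max_attempts:
                raise
            handler(exc)
            recovered.append(type(exc).__name__)
```

A caller would register one repair function per known failure, for example a handler for `ConnectionError` that waits and reconnects, and leave everything else to surface as an ordinary exception.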