Feature - In case of emergency, call SPRUCE
When disaster strikes, simulations could give authorities the information they need to save lives. But simulations are computationally intensive, and during a crisis, there’s no time to wait in line for access to computer resources. That’s where urgent computing comes in.
“What you really want is to be able to hook together or have access to all the supercomputers that you need, wherever they are,” said Pete Beckman, project lead for TeraGrid’s Special PRiority and Urgent Computing Environment, or SPRUCE. “The purpose of this sort of urgent computing infrastructure is to design and create the infrastructure before it’s needed.”
Simulation applications take too long to develop on the fly. Instead, the SPRUCE team collaborates with researchers to develop the code in advance.
SPRUCE’s first partnership was with the National Science Foundation’s Linked Environments for Atmospheric Discovery (LEAD) project. Giving LEAD access to supercomputing resources on TeraGrid, said Beckman, made it possible to ask questions like, “If the hurricane hits here, how much will the water rise over there?”
More recently, the SPRUCE team has been collaborating with the Network Dynamics and Simulation Science Laboratory (NDSSL) at Virginia Tech, using SPRUCE to simulate outbreaks of influenza such as H1N1.
“Recently that team used SPRUCE to investigate a number of mitigation strategies for responding to epidemics caused by flu-like illnesses,” said Beckman. These investigations attempted to answer questions such as, “When would it be appropriate to close schools, or what is the best way to allocate scarce resources such as vaccines and antivirals in the event of a pandemic?”
The collaboration has been fruitful for Madhav Marathe, a member of the NDSSL. "SPRUCE provided the much-needed computational resources to complete the analysis in a short period of time," said Marathe. "New high resolution interactive simulations like these, combined with resources provided by SPRUCE, could be used to support real-time decision making by policy makers during a pandemic."
Normally, to access computer resources, researchers must submit a job to a queue and then wait for its turn to come up. Once SPRUCE is up and running, however, researchers will be able to get nearly instant access to computer resources during an emergency. “The two most common modes are what we call next-to-run, which means you don’t actually cut anyone off, but you cut to the front of the line,” said Beckman. “The next option is pre-emption.”
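To make the difference between the two modes concrete, here is a minimal sketch of a single-machine job queue. It is a toy model, not SPRUCE's actual scheduler: the class name ToyScheduler, the mode strings, and the choice to put an interrupted job back at the front of the line are all assumptions made for illustration.

```python
from collections import deque


class ToyScheduler:
    """Toy single-resource batch queue (hypothetical, not SPRUCE code)."""

    def __init__(self):
        self.running = None     # job currently on the machine, if any
        self.waiting = deque()  # jobs in line; the leftmost runs next

    def submit(self, job, mode="normal"):
        if mode == "normal":
            # Ordinary job: join the back of the line.
            self.waiting.append(job)
        elif mode == "next-to-run":
            # Urgent job: cut to the front, but interrupt nothing.
            self.waiting.appendleft(job)
        elif mode == "preempt":
            # Urgent job: take the machine now. Assumption: the interrupted
            # job is requeued at the front so it resumes as soon as possible.
            if self.running is not None:
                self.waiting.appendleft(self.running)
            self.running = job
        else:
            raise ValueError(f"unknown mode: {mode}")
        # If the machine is idle, start whatever is first in line.
        if self.running is None:
            self.running = self.waiting.popleft()

    def finish(self):
        """Mark the running job done and start the next one, if any."""
        done = self.running
        self.running = self.waiting.popleft() if self.waiting else None
        return done
```

In this sketch a next-to-run job still waits for the current job to finish, while a pre-empting job starts immediately at the cost of interrupting someone else's work.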
Using the SPRUCE web portal, each computer resource provider will be able to choose how and to whom they provide computational cycles. “They have to be the ones who are in a position to say, ‘Okay, these three groups have the ability to dial 911,’” said Beckman. Some might choose to provide next-to-run access to all emergency researchers. Others might provide pre-emptive access, but only to a handful of pre-selected researchers.
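A per-provider policy of this kind could be pictured as a simple lookup table. The sketch below is purely illustrative: the provider names, group names, and the function allowed_modes are hypothetical and do not reflect how the SPRUCE portal stores its settings.

```python
# Hypothetical policy table: provider -> {mode: "all" or a set of groups}.
POLICIES = {
    "site_a": {"next-to-run": "all"},
    "site_b": {"next-to-run": "all", "preempt": {"group_1", "group_2"}},
}


def allowed_modes(provider, group):
    """Return the urgent modes this provider grants to this group."""
    modes = []
    for mode, who in POLICIES.get(provider, {}).items():
        # "all" means every emergency group; otherwise check membership.
        if who == "all" or group in who:
            modes.append(mode)
    return sorted(modes)
```

Here site_a offers next-to-run access to everyone, while site_b additionally grants pre-emption to two pre-selected groups.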
For those researchers who do get pre-empted, not all is lost. Computationally intensive applications are designed to write out their intermediate data at regular intervals. Thus, a pre-empted simulation would lose only the work done since its last checkpoint, typically an hour or two.
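The amount of work lost depends on how recently the job last saved its state. As a rough sketch, assuming a job checkpoints at exact, evenly spaced intervals and that writing a checkpoint takes no time (both simplifications, and the function name work_lost is hypothetical):

```python
def work_lost(preempted_at_hours, checkpoint_interval_hours):
    """Hours of computation lost when a job is pre-empted.

    Assumes checkpoints land at exact multiples of the interval, so the
    loss is the time elapsed since the most recent multiple.
    """
    return preempted_at_hours % checkpoint_interval_hours


# A job that checkpoints every 2 hours and is pre-empted 7.5 hours in
# restarts from its 6-hour checkpoint.
print(work_lost(7.5, 2))  # 1.5
```

Under these assumptions the loss is always less than one checkpoint interval, so more frequent checkpoints mean less wasted work.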
SPRUCE is still in research mode – and will remain so for the foreseeable future. “Until it’s routine that we ask the computer for help with predicting what’s going to happen, or the best path,” said Beckman, “urgent computing will still be something we do in the laboratory.”
—Miriam Boon, iSGTW