Comparison of actual and simulated earthquake waves used by the Southern California Earthquake Center CyberShake workflows. Image courtesy of Robert Graves, SCEC

Efficient, hands-free workflow management

Pegasus takes a modular approach to workflow systems, working in partnership with the workflow execution engine DAGMan, which was developed by the Condor team at the University of Wisconsin and shares in this month’s OCI grant. The Pegasus/DAGMan combination is already used in a variety of scientific applications, including the gravitational-wave physics project LIGO, the astronomy software Montage and applications at the Southern California Earthquake Center (SCEC), and it can scale to manage large numbers of tasks.

“Some workflows consume terabytes of data and can also produce terabytes of data,” says Deelman. “Some of the SCEC workflows have in the order of 100,000 tasks.”

Another bonus is that Pegasus cleans as it works. “It minimizes its workflow storage requirements by cleaning up the data produced by the workflow as the computation progresses; it removes data that has already been processed to keep the workflow footprint as small as possible, which helps improve efficiency.”

“Pegasus can also cluster tasks in the workflow,” she adds. “This reduces waiting times, which can add up when you need to manage lots of smaller tasks. This can significantly improve efficiency when you’re dealing with workflows with hundreds of thousands of nodes.”

Deelman says she hopes to continue to leverage collaborations with science communities to advance the state of the art in workflow technology.

Pegasus and DAGMan were first developed under the National Science Foundation GriPhyN project.

- Cristy Burne, iSGTW