iSGTW - International Science Grid This Week

iSGTW - 28 January 2009

Opinion - Many Task Computing: Bridging the performance-throughput gap


In the lower left corner, the low number of tasks and small input size make tightly-coupled Message Passing Interface (MPI) applications quite manageable. This is the traditional terrain of HPC.

As the data size increases (vertically), we move into the analytics category, such as data mining and analysis. 

In the lower right corner, data size remains modest, but the increasing number of tasks moves us into loosely-coupled applications involving many tasks. HTC can be considered a subset of this category.

Finally, the combination of many tasks and large datasets in the upper right corner moves us into the province of Many-Task Computing. MTC can also be considered to include the high-task, low-data (lower right) area.

Image courtesy of Ioan Raicu, University of Chicago.

Tightly-coupled applications, in which jobs must communicate with each other during execution, are typically best served by clustered High Performance Computing (HPC). Applications with many independent job streams, on the other hand, are better suited to distributed High Throughput Computing (HTC). But there are still other kinds of applications.

Over the past half decade we’ve examined many applications from astrophysics, bioinformatics, data mining and other fields, and have found that high-performance computations comprising multiple distinct activities coupled via file system operations (as opposed to the message passing interface commonly used in HPC) don’t fit neatly into either category. To address this, we’ve defined the concept of “Many Task Computing”. We believe that it bridges the gap between these two dominant computing paradigms and opens up opportunities to apply HPC systems in new ways to increasingly complex applications that were simply intractable just a few years ago.

Millions to billions of tasks

Many Task Computing (MTC) involves applications with tasks that may be small or large, single- or multiprocessor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, and loosely- or tightly-coupled. Applications may span millions to billions of tasks, entail tens of thousands of processor-years, incorporate a degree of parallelism able to occupy the largest supercomputers with hundreds of thousands of processors, and operate on terabyte- to petabyte-size datasets.

Resource-, communication- and data-intensive applications

MTC differs from HTC in the timescale of task completion and in the often data-intensive nature of its applications. It emphasizes the use of many resources over short periods of time to accomplish many computational tasks, both dependent and independent, with primary metrics expressed per second (e.g., FLOPS, tasks/s, MB/s of I/O) rather than as operations per month (e.g., jobs per month).

The economic and health benefits of speeding drug development are significant. Screening all possible 3-D molecular combinations for a drug typically requires over a billion computations, taking over 50 days on a 160K-processor Blue Gene/P supercomputer.

The graph represents a DOCK workload execution consisting of over 930,000 molecules on 116K processors on an MTC-enabled Blue Gene/P, completing in 2 hours. The per-task execution time varied widely, with a minimum of 1 second, a maximum of 5030 seconds, and a mean of about 700 seconds, and with significant I/O per task.

Image courtesy of Ioan Raicu, University of Chicago.  
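To make the per-second metrics concrete, the figures quoted for the DOCK run can be turned into a rough throughput and utilization estimate. This is only a back-of-envelope sketch: every input is a rounded number from the caption above, "116K" is assumed to mean 116,000 processors, and the function and variable names are illustrative rather than part of any MTC tool.

```python
# Back-of-envelope estimate for the DOCK run described in the caption.
# Inputs are the rounded figures quoted in the text, so the results are
# approximations, not measurements.
TASKS = 930_000             # "over 930,000 molecules"
PROCESSORS = 116_000        # assumption: "116K" means 116,000
MEAN_TASK_SECONDS = 700     # "a mean of about 700 seconds"
ELAPSED_SECONDS = 2 * 3600  # "completing in 2 hours"

SECONDS_PER_YEAR = 365 * 24 * 3600


def summarize(tasks, processors, mean_task_s, elapsed_s):
    """Derive per-second metrics from aggregate workload figures."""
    cpu_seconds = tasks * mean_task_s
    # Time the run would take if every processor were busy the whole time.
    ideal_makespan_s = cpu_seconds / processors
    return {
        "throughput_tasks_per_s": tasks / elapsed_s,
        "ideal_makespan_s": ideal_makespan_s,
        "utilization": ideal_makespan_s / elapsed_s,
        "cpu_years": cpu_seconds / SECONDS_PER_YEAR,
    }


summary = summarize(TASKS, PROCESSORS, MEAN_TASK_SECONDS, ELAPSED_SECONDS)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these rounded inputs the run averages roughly 129 tasks per second, and the processors are busy for a little under 80% of the two hours. The remainder reflects, among other things, start-up, I/O, and the long tail of slow tasks, though the rounding of the inputs makes this only indicative.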

MTC includes loosely-coupled applications that are generally communication-intensive but not naturally expressed using the standard message passing interface. Such applications are commonly implemented in workflow systems or parallel programming systems. Applications that operate on or produce large amounts of data need sophisticated data management in order to scale, and are a natural fit for MTC.
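The file-coupled style described above can be illustrated with a toy workflow: many independent tasks each write a small file, and one dependent task reads those files to produce a result. This is a minimal sketch using only the Python standard library. It is not how Swift or Falkon work; the thread pool merely stands in for a task dispatcher, and all function names and the scoring formula are hypothetical.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def score_task(task_id: int, workdir: Path) -> Path:
    """Independent task: compute a toy score and write it to its own file."""
    out = workdir / f"score_{task_id:05d}.txt"
    out.write_text(str((task_id * task_id) % 97))  # placeholder computation
    return out


def reduce_task(score_files, workdir: Path) -> Path:
    """Dependent task: its only inputs are the files the score tasks wrote."""
    total = sum(int(path.read_text()) for path in score_files)
    out = workdir / "total.txt"
    out.write_text(str(total))
    return out


def run_workflow(num_tasks: int, workers: int = 8) -> int:
    """Run the independent tasks in parallel, then the dependent task."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Tasks share no memory and pass no messages; each one only
            # produces a file in the shared working directory.
            score_files = list(
                pool.map(lambda i: score_task(i, workdir), range(num_tasks))
            )
        total_file = reduce_task(score_files, workdir)
        return int(total_file.read_text())


if __name__ == "__main__":
    print(run_workflow(1000))
```

Because the tasks interact only through files, the number of tasks and the load on the file system grow together, which is one reason data management matters so much at scale.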

Big impact on science

Efficient support of MTC applications on a wide range of resources will have a big impact on science. Our group is making progress in this direction. We have demonstrated good support for MTC on a variety of resources, from clusters and grids to supercomputers, through our work on Swift, a highly scalable scripting language/engine to manage procedures composed of many loosely-coupled components, and Falkon, a novel job management system designed to handle data-intensive applications with up to billions of jobs.

Ioan Raicu and Ian Foster, University of Chicago and Argonne National Laboratory, and Yong Zhao, Microsoft Corporation

 
