Opinion - Reaching for the Exa-scale with volunteer computing
(Editor's note: David Anderson is the founder of the popular volunteer computing platform known as BOINC, or the Berkeley Open Infrastructure for Network Computing. Here, he peers into his crystal ball to predict the direction of volunteer computing, especially as new, high-speed graphics processing units come into the market.)
Remember your prefixes? Kilo, mega, giga, tera, peta . . . exa? Each denoting a thousand times more than the one before?
Today, the average personal computer can do a few GigaFLOPS (one GigaFLOPS is one billion FLoating-point Operations Per Second). A modest cluster might do one thousand GigaFLOPS, or 1 TeraFLOPS. And for several years, one thousand TeraFLOPS, or one PetaFLOPS, was the Holy Grail of high-performance computing.
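The prefix ladder can be written out as a quick sketch (a minimal illustration; the machine speeds are the article's own examples):

```python
# Each SI prefix denotes 1,000x the one before.
FLOPS = {"kilo": 1e3, "mega": 1e6, "giga": 1e9,
         "tera": 1e12, "peta": 1e15, "exa": 1e18}

# The article's ladder: 1,000 GigaFLOPS = 1 TeraFLOPS,
# and 1,000 TeraFLOPS = 1 PetaFLOPS.
assert 1000 * FLOPS["giga"] == FLOPS["tera"]
assert 1000 * FLOPS["tera"] == FLOPS["peta"]
assert 1000 * FLOPS["peta"] == FLOPS["exa"]

pc = 3 * FLOPS["giga"]   # a few GigaFLOPS: an average PC
print(f"average PC: {pc:.0e} FLOPS")
```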
That one PetaFLOPS milestone was reached in the past year, first by Stanford's Folding@home (a volunteer computing project), then by various volunteer computing projects using BOINC, a middleware system for volunteer computing developed by my research group at the University of California at Berkeley. More recently, that same milestone was attained by IBM's Roadrunner supercomputer.
Yet science's demand for computing power grows without bound.
How can we reach the next milestone: 1,000 PetaFLOPS, or one ExaFLOPS?
At this scale, clusters and supercomputers run into problems with power consumption and heat dissipation, so Exa-scale computing using these approaches is probably many years away.
However, there may be a much faster and cheaper path to Exa-scale, using a combination of volunteer computing and graphics processing units (GPUs).
GPUs are the chips in PCs, laptops and game consoles that render 3-D graphics. The major GPU vendors are NVIDIA and ATI (acquired in 2007 by AMD). Architecturally, GPUs consist of hundreds of processors working in parallel on multiple data streams. This reduces the need for large caches, so GPUs can devote most of their transistors to arithmetic processing.
GPUs also have less need for backwards compatibility than CPUs.
Because of these advantages, GPUs are faster than CPUs: the latest GPU can do 500 GigaFLOPS for some applications, while the latest CPU does about 10 GigaFLOPS.
And this gap is widening.
The future with GPUs
Over the last few years, GPUs and CPUs have increased exponentially in speed, but the doubling time for GPUs has been about 8 months, compared to 16 months for CPUs.
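Why the gap widens follows directly from those doubling times; here is an illustrative projection (the starting speeds of 500 and 10 GigaFLOPS are the article's figures, and the smooth exponential growth is an assumption):

```python
# Exponential growth with different doubling times widens the GPU/CPU gap.
gpu, cpu = 500.0, 10.0              # GigaFLOPS today (article's figures)
gpu_doubling, cpu_doubling = 8, 16  # doubling time in months

for months in (0, 16, 32, 48):
    g = gpu * 2 ** (months / gpu_doubling)  # projected GPU speed
    c = cpu * 2 ** (months / cpu_doubling)  # projected CPU speed
    print(f"after {months:2d} months: GPU/CPU speed ratio ~ {g / c:.0f}x")
```

Under these assumptions the ratio doubles every 16 months: 50x today, 100x after 16 months, 200x after 32, and so on.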
GPUs have also overcome other hurdles that previously held them back from wide acceptance. Early GPUs were hard to program for scientific applications because they were designed to do graphics rendering: producing realistic two-dimensional images of complex 3-D scenes. Scientific algorithms had to be expressed in terms of graphics primitives, often a very difficult task.
This has changed.
GPU architectures have steadily become more general-purpose, and the latest models do double-precision floating-point math, which is needed for many scientific applications. In 2007, NVIDIA released a system called CUDA that allows GPUs to be programmed in the C language, making it much easier for scientists to develop and port applications to run on GPUs.
Scientists have already used CUDA for applications in molecular dynamics, protein structure prediction, climate and weather modeling, medical imaging, and many other areas.
BOINC has recently added support for GPU computing. The BOINC client detects and reports GPUs, and the BOINC server schedules and dispatches jobs appropriately. If configured to do so, BOINC can even use a PC's GPU 'in the background' while the computer is in use. Already, one BOINC-based project (http://GPUgrid.net) has CUDA-based applications, and several other projects will follow suit shortly.
These trends could be the raw ingredients for Exa-scale computing. For example, if the average speed of a late-model GPU reaches 1 TeraFLOPS, and 4 million GPU-equipped PCs participate in volunteer computing and these PCs are available an average of 25% of the time, then we get 1 ExaFLOPS.
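The back-of-the-envelope arithmetic checks out (a sketch using the article's projected figures):

```python
gpu_speed_flops = 1e12     # 1 TeraFLOPS per late-model GPU (projected)
volunteer_pcs = 4_000_000  # 4 million GPU-equipped volunteer PCs
availability = 0.25        # available 25% of the time, on average

total = gpu_speed_flops * volunteer_pcs * availability
print(f"{total:.0e} FLOPS")  # 1e+18 FLOPS, i.e. 1 ExaFLOPS
```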
Could this scenario be realized in the near term, say in 2010? In my opinion, it's near-certain that GPUs will reach 1 TeraFLOPS by then, and a large percentage of PCs will be available to run BOINC (although the advent of ‘green computing’ will decrease availability somewhat). The hard part will be getting 4 million GPU-equipped volunteered PCs; there are currently about 1 million PCs participating, not all of them GPU-equipped, so an order-of-magnitude increase is needed.
Achieving this may require the assistance of a PC vendor or chip manufacturer.
In summary: the combination of volunteer computing and GPUs can feasibly provide Exa-scale computing power for science in a remarkably short time frame, years ahead of other paradigms. Scientists wanting to share in this resource can do so by developing GPU-enabled applications and deploying them in BOINC-based volunteer computing projects.
—David Anderson, Research Scientist, University of California, Berkeley, and founder of the Berkeley Open Infrastructure for Network Computing (BOINC)