| Indiana University’s Data Capacitor is a high-speed, high-bandwidth storage system for research computing that serves all IU campuses and NSF TeraGrid users. At peak performance, the Data Capacitor has an aggregate transfer rate of 14.5 gigabytes per second. Image courtesy of Indiana University |
Bandwidth Challenge

First place in this year’s Bandwidth Challenge went to a team using Indiana University’s Data Capacitor, a system designed to store and manipulate massive data sets. The team achieved a peak transfer rate of 18.21 gigabits per second out of a possible maximum of 20 gigabits per second. This performance was nearly twice the peak rate of the nearest competitor. The team achieved an overall sustained rate of 16.2 gigabits per second (roughly equivalent to sending 170 CDs of data per minute) using a transatlantic network path that included the Internet2, GÉANT and DFN research networks.
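The CD comparison can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes a CD holds about 700 MB and uses decimal units, neither of which the article specifies, and the function name is hypothetical.

```python
# Assumption: one CD holds roughly 700 MB (decimal megabytes).
# The article does not say which capacity its comparison used.
CD_CAPACITY_BYTES = 700 * 10**6


def cds_per_minute(rate_gbit_per_s: float, cd_bytes: int = CD_CAPACITY_BYTES) -> float:
    """Convert a transfer rate in gigabits per second to CDs per minute."""
    bytes_per_second = rate_gbit_per_s * 10**9 / 8  # 8 bits per byte
    return bytes_per_second * 60 / cd_bytes


# Figures reported in the article.
sustained = cds_per_minute(16.2)   # sustained rate, Gbit/s
peak_fraction = 18.21 / 20.0       # peak rate over the possible maximum

print(f"Sustained rate: about {sustained:.0f} CDs per minute")
print(f"Peak rate as a fraction of the maximum: {peak_fraction:.3f}")
```

Under these assumptions the sustained rate works out to a little over 170 CDs per minute, in line with the article’s rounded figure, and the peak rate is about nine-tenths of the available 20 gigabits per second.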
“This project simultaneously pushed the limits of networking and storage technology while demonstrating a reproducible model for remote data management. Best of all, we did this using a variety of research applications that we support every day at Indiana University,” said Data Capacitor and Bandwidth Challenge project leader Stephen Simms.

The Data Capacitor is powered by the open source Lustre file system and the Linux operating system. It is currently accessible to researchers through IU’s participation in the TeraGrid. The winning team was led by Indiana University, with partners from the Technische Universitaet Dresden, Rochester Institute of Technology, Oak Ridge National Laboratory and the Pittsburgh Supercomputing Center.

Cluster Challenge

A team of undergraduates from the University of Alberta, Canada, won the inaugural Supercomputing Cluster Challenge, a three-day cluster-building marathon. Competing teams assembled small clusters on the exhibit floor, running benchmarks and applications selected by industry and high performance computing veterans. Power consumption was limited: each team was allowed just a single 26 amp, 110 volt circuit. Clusters were judged on the speed of benchmarks and throughput of application runs.

The University of Alberta’s winning system was a 64-core (Xeon 2.66 GHz) system with 20 Gbit InfiniBand and 16 GB of memory running Scientific Linux. The competition was designed to show how accessible clusters have become: the systems built by the student teams would have been considered top-of-the-line supercomputers just ten years ago.

Storage Challenge

This award for the most effective approach to using large-scale storage for high-performance computing went to a novel software framework called ParaMEDIC, or Parallel Metadata Environment for Distributed I/O and Computing.
The ParaMEDIC software was used to search the sequences of all completed microbial genomes to discover missing genes and speed future searches by generating a complete genome similarity tree. The ParaMEDIC software framework used a semantics-based approach to create a metadata representation that was four orders of magnitude smaller than the actual output data. “Using ParaMEDIC, the entire genome similarity tree, corresponding to a petabyte of data, can fit into a 4-gigabyte iPod nano,” said team member Pavan Balaji of Argonne National Laboratory.

The entire task required many millions of CPU-hours of computation and generated a petabyte of uncompressed output. Because few supercomputer centers provide both the computational and storage resources required for this task at the same time, the research team relied on a worldwide supercomputer that aggregated compute resources from various locations within the U.S. and the TSUBAME storage resources at the Tokyo Institute of Technology in Japan, with technical support from Sun Microsystems.

“In total, we relied on six U.S. supercomputing institutions and accessed over 12,000 processors across eight supercomputers. The ParaMEDIC framework then improved compute utilization from 10 percent to nearly 100 percent for the compute resources and storage bandwidth utilization from 0.04 percent to 90 percent for the storage resources,” said Wu-chun Feng of Virginia Tech.

The team comprised researchers from Argonne National Laboratory, Virginia Tech and North Carolina State University.
|