| Cluster-to-cluster data movement with GridFTP using up to 64 nodes on each end of TeraGrid’s 30 Gbit/s wide area network between Urbana, IL and San Diego, CA.
Images courtesy of Raj Kettimuthu (plot) and Corky and Holly Siegel (duck). | GridFTP, a data transfer protocol optimized for high-bandwidth wide-area networks, handles an average of more than 2.5 million data transfers a day.
Its users include the Large Hadron Collider, the Southern California Earthquake Center, the Relativistic Heavy Ion Collider, the Laser Interferometer Gravitational-Wave Observatory, the European Space Agency, the Disaster Recovery Center in Japan, and even the British Broadcasting Corporation.
Based on the old workhorse FTP, GridFTP supports reliable and restartable data transfers and provides extensions for high-performance operation and security. It is a specification, standardized within the Open Grid Forum, so anyone can write code to implement it. The Globus Alliance provides a reference implementation, and frequent upgrades keep it current and competitive.
To overcome the performance limitations of TCP (a veteran core transmission protocol of the Internet protocol suite), GridFTP supports parallel TCP streams over high-speed wide-area network links and allows users to set the optimal TCP buffer size for a transfer. These features offer roughly an order of magnitude (a factor of ten) improvement in performance over standard FTP. Globus GridFTP also operates on UDT, a newer transfer protocol designed for extremely high-speed networks.
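Why buffer size matters can be seen with a little arithmetic. A single TCP stream can keep a link full only if its buffer holds at least the bandwidth-delay product (link bandwidth multiplied by round-trip time). The Python sketch below works this out and then splits the result evenly across parallel streams, which is a common rule of thumb rather than anything GridFTP prescribes. The function names and the example figures (a 1 Gbit/s path with a 60 ms round trip) are illustrative assumptions, not measurements from the transfers described here.

```python
def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: data 'in flight' on a full link.

    bits/s * ms / 1000 gives bits; dividing by 8 gives bytes.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    return bandwidth_bits_per_s * rtt_ms // 8000


def per_stream_buffer_bytes(bandwidth_bits_per_s: int, rtt_ms: int, streams: int) -> int:
    """Rule-of-thumb buffer per stream when the BDP is shared by parallel streams."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    total = bdp_bytes(bandwidth_bits_per_s, rtt_ms)
    return -(-total // streams)  # ceiling division


if __name__ == "__main__":
    # Hypothetical path: 1 Gbit/s bandwidth, 60 ms round-trip time (assumed values).
    bandwidth = 1_000_000_000
    rtt = 60
    print("Single stream buffer:", bdp_bytes(bandwidth, rtt), "bytes")
    for n in (2, 4, 8):
        print(n, "streams:", per_stream_buffer_bytes(bandwidth, rtt, n), "bytes each")
```

Under these assumed figures a single stream would need a buffer of about 7.5 MB, far larger than many default TCP settings, which is one reason tuning the buffer or adding parallel streams can help so much on long, fast links.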
GridFTP supports coordinated data transfer using multiple computer nodes at both the source and destination, adding another order-of-magnitude improvement in performance on network links that support much higher data rates than the individual nodes at either end. With this feature, GridFTP has delivered over 25 Gbit/s on a 30 Gbit/s TeraGrid link using 32 nodes at both ends. |