Research Report: Campus bridging, networks, and data

The Vasco da Gama bridge spans the Tagus River in Lisbon, Portugal. CC BY-NC 3.0 Aires Dos Santos.

How should the university campus interface with national cyberinfrastructure (CI)? What role do networks play in today's data-intensive environment? These are questions that matter, which is why iSGTW staff were pleased to see this third and final report on campus bridging. This report focuses on data and networks. Read on for iSGTW’s summary of the report. You can also read our articles about the reports on how campus bridging relates to leadership here, and software and services here.

Today’s system of national and international cyberinfrastructure is growing and increasingly complicated. A good deal of cyberinfrastructure for data and networks is already available nationally. What is greatly needed, and currently lacking, is a national architecture for cyberinfrastructure that would allow the sort of seamless integration of local, national, and international resources that is increasingly valuable to scientists.

What is campus bridging?

The goal of campus bridging is to enable the seamlessly integrated use among: a scientist or engineer’s personal cyberinfrastructure; cyberinfrastructure on the scientist’s campus; cyberinfrastructure at other campuses; and cyberinfrastructure at the regional, national, and international levels; so that they all function as if they were proximate to the scientist. When working within the context of a Virtual Organization, the goal of campus bridging is to make the ‘virtual’ aspect of the organization irrelevant (or helpful) to the work of the VO.

–The NSF's Task Force on Campus Bridging

In April of 2010, Indiana University coordinated a workshop, funded in part by the US National Science Foundation, on the data and networking aspects of cyberinfrastructure and campus bridging. The workshop was attended by 45 participants from universities, federal labs, and cyberinfrastructure organizations such as Internet2 and Open Science Grid. Over the course of two days of discussion and presentations, several themes emerged.

Broadly speaking, there are four main impediments to broader adoption of advanced CI – awareness, education, ease of use, and reliability. Researchers are often not aware of the CI that is available for their use. Even those researchers who know about the CI they can access perceive it as hard to use or too much to learn. And when they do choose to use the CI at their disposal, it often breaks or performs poorly.

Within US higher education, there is a large gap between the campus, regional, and national resources that are available and what researchers generally know about them. Given this, a first step in effective campus bridging is to focus on education about the availability of these resources. For researchers at institutions of higher education to bridge from where they are to the best facilities for their needs as effectively as possible, they must first know what resources are available and appropriate.

Problems with ease of use arise because every new CI tool has a new interface to learn. They are compounded by the fact that in today’s CI environment, switches to new tools often occur rapidly. An evolution to bring in new features and capabilities that is based on a known user interface would be better accepted by researchers; entirely new user interfaces for each application are neither necessary nor helpful.

Reliability is also difficult to achieve because of issues related to interoperability. Today’s CI ecosystem consists of components that must be able to work in concert. Yet, in most cases, each component was developed independently, with minimal information about the architecture of the other components.

Ease of use and reliability each relate in their own way to several areas of concern, including the importance of widespread access to high-speed networks, the concept of a federated distributed file system, data preservation, and continuity of funding for cyberinfrastructure.

Effective, efficient federated identity management and authentication was also among these areas of concern. An NSF requirement to employ the InCommon Federation global federated system for identity management for all systems and services it funds, combined with National Institutes of Health adoption of InCommon, should lead the nation to consistent use of a single, interoperable, federated identity system.

The following 13 recommendations arose from the April 2010 workshop:

  1. The National Science Foundation should lead (and fund) the development of a national architecture for cyberinfrastructure that will enable the seamlessly integrated use among: a scientist or engineer’s personal cyberinfrastructure; cyberinfrastructure on the scientist’s campus; cyberinfrastructure at other campuses; and cyberinfrastructure at the regional, national, and international levels; so that they all function as if they were proximate to the scientist.
  2. The National Science Foundation should strengthen funding for the Campus Champions and similar campus-oriented outreach and education programs.
  3. The National Science Foundation must design its cyberinfrastructure programs, including in computational resources, software, networking, storage, and visualization, to incent campus cyberinfrastructure investment. The desired outcome is a balanced and at-least-partially coordinated pattern of investments in campus and national cyberinfrastructure.
  4. The National Science Foundation should fund the architecting, implementation, and ongoing maintenance and improvement of a Campus Bridging Software Stack. This should permit use with Unix-based and other operating systems. It should be standards-based rather than implementation-based. It must be simple to use, secure, and enable effective performance of local cyberinfrastructure.
  5. As part of a strategy of coherence between the National Science Foundation and campus cyberinfrastructure and reducing reimplementation of multiple authentication systems, the NSF should encourage the use of the InCommon Federation global federated system by using it in the services it deploys and supports, unless there are specific technical or risk management barriers.
  6. The National Science Foundation should fund the strengthening of the emerging federated identity, authentication, and authorization infrastructure, with particular regard to improved scalability (including through inter-federation), improved adequacy of authorization in the face of cyberinfrastructure-related applications, and improved security.
  7. Campuses should deploy and operate perfSONAR and related tools to systematically measure, debug, record, and display the measured performance.
  8. The National Science Foundation should create a new program funding high-speed (currently 10 Gbps) connections from campuses to the nearest landing point for a national network backbone. The design of these connections must include support for dynamic network provisioning services and must be engineered to support rapid movement of large scientific datasets.
  9. The National Science Foundation should fund the architecting, implementation, and operations of a wide-area federated distributed file system for use by the US open research community. The resulting system should support federated identity, very high-speed transfer of data (files or blocks) among major repository components of the system, and replication of files to further robustness and performance.
  10. The National Science Foundation should fund the architecting of cost-effective ways to archive and preserve data collections, and fund at least some facilities for archiving important data at the national level.
  11. The National Science Foundation should fund development of software tools and technology needed for effective remote visualization, and the NSF and institutions of higher education should fund the technology implementation and infrastructure needed for effective remote visualization.
  12. The National Science Foundation should encourage and fund the training of more researchers of all types (especially staff) in computational and data-intensive science and engineering.
  13. The National Science Foundation should provide more funding for staff supporting use of cyberinfrastructure in research and in particular should provide funding that is more stable and predictable over time.

For the full report, visit the report's page here.
