Feature - Data is big news
What will data infrastructures look like 20 years from now?
To find out, the European Commission (EC) assembled a panel of experts to prepare a vision of scientific data e-infrastructures in 2030. The resulting report, overseen by chair John Wood of Imperial College London, was released last week in Brussels.
“Data is big news,” said Wood. In the past, data might have been thrown away, but it’s now being kept and recorded as emails, videos, mobile phone data and more. CERN produces petabytes of data each year, but with genome sequencing, electronic health records and upcoming experiments such as the Square Kilometer Array, we’re on schedule to generate hundreds of times more. In the words of Wood: “If you think CERN is big, you ain’t seen nothing yet.”
But with more data comes more challenges. Not just how to store it, but how to access, preserve and trust it.
The report details the potential benefits of developing a data e-infrastructure: different domains can collaborate, and data can be used, re-used, combined and shared while its integrity is maintained.
But to reach this ideal world, several actions must be taken, says the report. It calls on the EU to develop a framework for data e-infrastructures, along with an international advisory group to plan for it. Training is key, so that researchers recognize the importance of sharing data, are able to reach the information they need, and can evaluate whether it can be trusted.
A common language
Said Wood: “e-Infrastructures can democratize research.” The report’s authors foresee citizens and industries accessing and contributing to data while policymakers use it to make informed decisions. The panel hopes that an inclusive infrastructure will benefit society as a whole.
However, the group is keen to emphasize that it is not championing a single universal database. Instead, it encourages data initiatives to take interoperability into account and to work towards a common language so that data can be understood by all.
One simple way to start on this: “Put your data in a repository or on your website so they’re accessible, and make sure that they have identifiers so they can be found for the next 20 years,” said Christopher Best, one of the report’s authors.
“Science has always been based on exchange of information and intense interactions between researchers,” says Neelie Kroes, EC vice president and European Digital Agenda Commissioner. “We should all strive to make real progress towards open access to scientific data.” The findings from the report will now be fed into an upcoming communication on scientific information as well as the Commission’s broader research infrastructure policies.
The report is the result of six months of intense brainstorming and discussion among experts from across the world, encompassing science and the humanities as well as industry and academia, says Mario Campolargo, the EC’s director of emerging technologies and infrastructures.
—Manisha Lalloo, e-ScienceTalk. More on the report can be downloaded here.