“You don’t have to have a PhD to be interested in data and to want to analyze it.” Bob Grossman is working to make large data sets easily accessible and publicly available, even to children. Stock image from sxc.hu
So you’ve used a grid to split up your job, process it faster, then return your results. You now have a nice chunky terabyte of data. What do you do with it?
Bob Grossman, Director of the National Center for Data Mining at the University of Illinois, Chicago, U.S., says the answer is share, share, share.
“In terms of impact on society, the ability to use transparently other people’s data is going to be transforming,” Grossman says. “It is about ‘network effects’,” he continues. “In the same way that a network becomes more interesting as more people join it, you can draw more interesting conclusions about your own data if you put it into the context of other people’s data.”
A fine notion in principle. But how can you get these network-busting bundles of new data to the people who need them?
Simple, says Grossman. You just send them, to everyone and anyone who might like to take a look. “Our motivation for the last ten years has been to create a web for data, so it’s easy to browse, explore and download it. The system we built, called DataSpace, still controls who can write data, but we encourage anyone in the world to read it.” Driven by this ultimate goal, Grossman turned his eye to the networks: could they distribute large sets of data across thousands of miles, and all without wasting a second? No, not really, not at all.