iSGTW - International Science Grid This Week

iSGTW 18 July 2007

 

Opinion - From Moore to Metcalfe: the network as the next database platform


What’s the best way to count how many jelly beans of each type are in a jar? Traditional database systems work by emptying the jar, counting the beans, and then returning them to the jar.
Stock image from sxc.hu

Innovation in database systems technology has traditionally been driven by the push and pull between Moore’s law and Shugart’s law, which describe the competing exponential growth in both computing power and the volume of the data over which that computing must be applied.

Increasingly, however, it is Metcalfe’s law that is putting pressure on state-of-the-art data management.

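Metcalfe’s law holds that the value of a network grows roughly as the square of the number of its connected nodes. It describes the network effects that cause networks to continually expand.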

The practical impact of this law is that data-intensive applications are becoming increasingly distributed. Nowhere is this trend more apparent than in the area of scientific grid computing.

As science has become more data-centric, scientific users have come to rely upon relational database technology and the SQL query language as key tools in their data analysis processes.

The inherent benefits of SQL include: productivity due to powerful support for bulk data operations; ease of maintenance and evolution due to declarative programming; and efficiency due to sophisticated optimization techniques developed over several decades.

In distributed environments, however, current database technology acts only as an endpoint.

Thus, many users still rely on hand-coded solutions for tasks such as filtering, cleaning, and event detection/response in high-volume data streams flowing through networks. 

Stream Query Processing turns tradition on its head

An emerging database technology, called Stream Query Processing, has the potential to change all this.

With Stream Query Processing, the traditional database arrangement of persistent data waiting for queries to arrive is turned on its head. In a stream processing system, it is the queries that are persistent, and processing is initiated by the arrival of new data.

This inverse structure allows queries to continuously generate incremental answers based on the data that have been seen “so far”.  
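The contrast can be sketched in a few lines of Python using the jelly bean example. This is only an illustration under stated assumptions, not how any particular stream engine is built: the function names `batch_count` and `continuous_count` and the sample data are hypothetical, and a real system would express the query in SQL rather than in hand-written code.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def batch_count(jar: Iterable[str]) -> Dict[str, int]:
    """Traditional style: the data is stored first, then a query scans all of it."""
    return dict(Counter(jar))


def continuous_count(stream: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Streaming style: the query is persistent and each arrival updates the answer."""
    counts: Counter = Counter()
    for bean in stream:
        counts[bean] += 1
        # Emit the answer for everything seen "so far".
        yield dict(counts)


# Hypothetical sample data: beans arriving one at a time.
beans = ["red", "green", "red", "yellow", "red"]
answers = list(continuous_count(beans))
for answer in answers:
    print(answer)
```

Each arriving bean produces an updated answer, and the last incremental answer matches what the batch query would return once all the data had been stored.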

Stream query processing works by counting each bean as it enters the jar, enabling on-line processing.
Stock image from sxc.hu

As a result, interactive applications and applications that demand low latency can be written in the familiar and powerful SQL language.

Furthermore, this approach can provide significant performance benefits. These result from the ability to optimize all of the queries as a unit, thereby avoiding redundant work, and from the intelligent and adaptive placement of query functionality in the network.
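One way to picture the "optimize as a unit" idea is a set of standing queries that share filter conditions, so that each condition is evaluated once per arriving item rather than once per query. The sketch below is a simplified assumption for illustration: the `SharedQueryEngine` class, its methods, the query names, and the temperature readings are all hypothetical, and it does not cover placement of query functionality in the network.

```python
from typing import Callable, Dict, List, Tuple

Reading = Dict[str, float]
Predicate = Callable[[Reading], bool]


class SharedQueryEngine:
    """Hypothetical engine: standing queries with the same predicate share one evaluation."""

    def __init__(self) -> None:
        self.predicates: Dict[str, Predicate] = {}
        self.queries: List[Tuple[str, str]] = []  # (query name, predicate name)
        self.matches: Dict[str, int] = {}         # running match count per query
        self.evaluations = 0                      # predicate evaluations performed

    def add_predicate(self, name: str, predicate: Predicate) -> None:
        self.predicates[name] = predicate

    def add_query(self, query_name: str, predicate_name: str) -> None:
        self.queries.append((query_name, predicate_name))
        self.matches[query_name] = 0

    def on_arrival(self, reading: Reading) -> None:
        # Evaluate each distinct predicate once, however many queries use it.
        results: Dict[str, bool] = {}
        for name, predicate in self.predicates.items():
            self.evaluations += 1
            results[name] = predicate(reading)
        # Every standing query reuses the shared results.
        for query_name, predicate_name in self.queries:
            if results[predicate_name]:
                self.matches[query_name] += 1


engine = SharedQueryEngine()
engine.add_predicate("hot", lambda r: r["temp"] > 30)
engine.add_predicate("cold", lambda r: r["temp"] < 15)
engine.add_query("alert_hot", "hot")
engine.add_query("log_hot", "hot")
engine.add_query("alert_cold", "cold")

for temp in [35.0, 10.0, 31.0, 20.0]:
    engine.on_arrival({"temp": temp})

print(engine.matches)
print(engine.evaluations)
```

With three queries over four readings, evaluating each query independently would take twelve predicate evaluations; sharing the two distinct predicates takes eight.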

Stream Query Processing is an emerging technology that grew out of research performed largely at universities over the past decade. 

Early prototypes spawned numerous startup companies, and recent product announcements by existing enterprise software players have led to an expansion of interest and choices in the market.   

Grid computing users have come to appreciate the benefits of SQL processing for post hoc data analysis and manipulation, but have believed these benefits were unavailable for on-line processing.  

Stream Query Processing with full support for the SQL language removes this unnecessary barrier to productivity and performance.  

As such, it represents the natural adaptation of database technology to the increasingly distributed world prescribed by Metcalfe’s law.

- Michael J. Franklin

Michael J. Franklin is a professor of computer science at the University of California, Berkeley, U.S., and chief technical officer of Truviso, Inc. He was a keynote speaker at last month's IEEE International Symposium on High Performance Distributed Computing.

 




