StreamSQL

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Dlyons493 (talk | contribs) at 22:07, 28 March 2006 ('''StreamSQL'''). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

StreamSQL is a query language that extends the industry-standard SQL to process real-time data streams.

A team of 30 professors and students from MIT, Brown University, and Brandeis University, led by Dr. Michael Stonebraker, worked from 2001 through 2003 to develop the core principles behind StreamSQL.

The value of SQL lies in its ability to issue queries against stored data, and StreamSQL brings the same querying capability to data streams. Where SQL operates on finite sets of stored records, StreamSQL also handles continuous event streams and time-based records. It retains the capabilities of SQL and adds a rich windowing system, the ability to mix stored data with streaming data, and primitives that can be extended with custom logic such as analytic functions.
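The idea of a continuous query that mixes streaming and stored data can be illustrated with a minimal sketch. It is written in Python rather than in StreamSQL syntax, and the names `COMPANIES`, `enrich_large_trades`, and the record fields are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical stored table (symbol -> company name); stands in for a database table.
COMPANIES: Dict[str, str] = {
    "IBM": "International Business Machines",
    "HPQ": "Hewlett-Packard",
}


def enrich_large_trades(trades: Iterable[dict], min_size: int) -> Iterator[dict]:
    """Continuous query: filter a stream and join each record with stored data."""
    for trade in trades:                       # the input may be unbounded
        if trade["size"] < min_size:           # filter, like a WHERE clause
            continue
        name = COMPANIES.get(trade["symbol"])  # lookup against the stored table
        if name is None:                       # inner-join behaviour: drop unmatched records
            continue
        yield {**trade, "company": name}       # emit a result as soon as it is known


# A short finite list stands in for a live stream in this demonstration.
stream = [
    {"symbol": "IBM", "size": 500},
    {"symbol": "XYZ", "size": 900},
    {"symbol": "HPQ", "size": 50},
    {"symbol": "HPQ", "size": 700},
]
results = list(enrich_large_trades(stream, min_size=100))
```

Unlike a query over a stored table, the function never needs to see the whole input: each record is processed and emitted as it arrives.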


StreamSQL offers the following:


A familiar, standard paradigm - SQL’s combination of functionality, power, and relative ease of use has made it an enduring standard for complex data transformations. StreamSQL extends the standard SQL query model and operators so that they also apply to continuous data streams.


Querying over time windows - StreamSQL extends the semantics of standard SQL (which assumes records in a finite stored dataset) by adding rich windowing constructs and stream-specific operators. A window defines the “scope” of a multi-message operator such as an aggregate or a join, telling it when to finish an operation and output an answer. Windows can be defined over time, over a number of messages, or over breakpoints in other message attributes.


Operators - StreamSQL operators filter streams; merge, combine, and correlate multiple streams; and run time-window-based aggregations and computations on real-time streams or stored tables. The operators also manage stream disorder and late or missing data, and can access and manipulate in-memory and external storage.


Customization & Extensibility - Because the StreamSQL operator set is extensible, developers can add new processing functionality to the system, for example by implementing a proprietary analysis algorithm or by creating user-defined aggregates, functions, and custom operators.
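The windowing and extensibility features listed above can be sketched in a few lines. The example is Python, not StreamSQL syntax, and every name in it (`tumbling_time_window`, `tumbling_count_window`, `value_range`) is hypothetical. It assumes events arrive in timestamp order, so it does not model the handling of disordered or late data.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Event = Tuple[float, float]  # (timestamp in seconds, value)
Aggregate = Callable[[List[float]], float]


def tumbling_time_window(
    events: Iterable[Event], width: float, aggregate: Aggregate
) -> Iterator[Tuple[float, float]]:
    """Yield (window_start, aggregate(values)) whenever a time window closes."""
    window_start = None
    values: List[float] = []
    for ts, value in events:
        if window_start is None:
            window_start = ts - (ts % width)   # align the first window to a multiple of width
        while ts >= window_start + width:      # this event lies beyond the open window
            if values:                         # empty windows produce no output
                yield window_start, aggregate(values)
            values = []
            window_start += width
        values.append(value)
    if values:                                 # flush the last window (finite demo input only)
        yield window_start, aggregate(values)


def tumbling_count_window(
    values: Iterable[float], size: int, aggregate: Aggregate
) -> Iterator[float]:
    """Yield one aggregate per `size` consecutive messages; a partial batch is not emitted."""
    batch: List[float] = []
    for v in values:
        batch.append(v)
        if len(batch) == size:
            yield aggregate(batch)
            batch = []


def value_range(values: List[float]) -> float:
    """A user-defined aggregate: spread between the largest and smallest value."""
    return max(values) - min(values)


events = [(1.0, 10.0), (4.0, 20.0), (12.0, 5.0), (31.0, 7.0)]
time_results = list(tumbling_time_window(events, 10.0, value_range))
count_results = list(tumbling_count_window([1, 2, 3, 4, 5], 2, sum))
```

Passing the aggregate as a parameter mirrors the extensibility point: the window decides when an answer is produced, while the function supplied by the developer decides what the answer is.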