Strozzi NoSQL
NoSQL is a shell-based relational database management system (RDBMS) that runs under Unix-like operating systems, or under others with compatibility layers (e.g., Cygwin under Windows).[1] Its name merely reflects the fact that it does not express its queries in Structured Query Language (SQL); the NoSQL RDBMS is distinct from the general concept of NoSQL databases that emerged around 2009, which are typically non-relational.
Construction
NoSQL uses the operator-stream paradigm, in which a number of "operators" each perform a unique function on the data passed to them. The stream is supplied by the UNIX input/output redirection system, so the result of one operator can be passed through a pipe to the next. Because UNIX pipes are implemented in memory, this is a very efficient arrangement.[1][2]
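As a rough illustration of the operator-stream idea, the following shell sketch chains two small filters through a pipe. The function names `pick_rows` and `pick_columns`, the sample data, and the tab-separated layout with a header line are assumptions made for this example; they are not NoSQL's own operators or file format.

```shell
#!/bin/sh
# Hypothetical sample table: tab-separated, first line holds column names.
emp_table() {
    printf 'name\tdept\tsalary\n'
    printf 'alice\tops\t5200\n'
    printf 'bob\tdev\t6100\n'
    printf 'carol\tops\t4800\n'
}

# Operator 1: keep the header plus the rows whose second field equals $1.
pick_rows() {
    awk -F '\t' -v want="$1" 'NR == 1 || $2 == want'
}

# Operator 2: keep only the first and third fields of every line.
pick_columns() {
    awk -F '\t' 'BEGIN { OFS = "\t" } { print $1, $3 }'
}

# Each operator reads standard input and writes standard output,
# so the pipe carries the table from one stage to the next.
result=$(emp_table | pick_rows ops | pick_columns)
printf '%s\n' "$result"
```

Since every stage is an ordinary filter, further stages (for example `sort`) could be appended to the pipeline without changing the earlier ones.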
NoSQL, with development led by Carlo Strozzi, is the latest and perhaps the most active in a line of implementations of the stream-operator database design originally described by Evan Shaffer, Rod Manis, and Robert Jorgensen in a 1991 Unix Review article and an associated paper. Other implementations include the Perl-based rdb, a commercial version by the original authors called /rdb, and Starbase,[3] a version with added astronomical data operators by John Roll of Harvard and the Smithsonian Astrophysical Observatory. Because awk is well suited to processing piped data, most implementations are a mixture of awk and other programming languages, usually C or Perl.
The concept was later expanded in the associated paper and in the book "Unix Relational Database Management". NoSQL (along with other similar stream-operator databases) is well suited to a number of fast, analytical database tasks, and has the significant advantage of keeping its tables in ASCII text form, which allows many powerful text processing tools to be used as an adjunct to the database functions themselves. Popular tools for use with NoSQL include Python, Perl, awk, and shell scripts using the ubiquitous Unix text processing tools (cut, paste, grep, sort, uniq, etc.).
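Because a table is plain text, ordinary utilities can work on it directly. The sketch below counts rows per department using only `tail`, `cut`, `sort`, `uniq`, and `awk`; the file contents and the tab-separated layout with a header line are invented for the example and are not taken from NoSQL itself.

```shell
#!/bin/sh
# Hypothetical table stored as plain tab-separated text with a header line.
table=$(mktemp)
printf 'name\tdept\nalice\tops\nbob\tdev\ncarol\tops\n' > "$table"

# Skip the header, keep the second column, then count each distinct value.
# The final awk stage normalises the padded "count value" layout of uniq -c
# into "value<TAB>count".
counts=$(tail -n +2 "$table" | cut -f 2 | sort | uniq -c | awk '{ print $2 "\t" $1 }')
printf '%s\n' "$counts"

rm -f "$table"
```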
NoSQL is written mostly in interpreted languages, which slows execution, but its use of ordinary pipes and filesystems means that it can be extremely fast for many applications when run on RAM filesystems or when relying heavily on pipes, which are memory-based in many implementations.[4]
Philosophy
The reasons for avoiding SQL are as follows:[4]
- Complexity: Commercial database products are often too costly for minor projects, and free databases are often too complex. Neither offers the shell-level approach of NoSQL.
- Portability:
  - Data: The data from NoSQL can easily be ported to other types of machines, like Macintoshes or Windows computers, since tables exist as simple ASCII text and can be easily read from or redirected to files at any point in processing.
  - Software: NoSQL can run on any UNIX machine that has the Perl and AWK programming languages installed, and reportedly also on the Cygwin UNIX-like environment for Microsoft Windows.
- Unlimited: NoSQL has no arbitrary limits, such as a limit on data field size, number of columns, or file size, and can in principle work where other products cannot. (The number of columns in a table may actually be limited to 32,768 by some implementations of the AWK programming language.)
- Usability: With its straightforward and logical concept, NoSQL can easily be used by non-specialists. For instance, rows of data are selected with the 'row' operator, columns with the 'column' operator.
In contrast to other RDBMSs, NoSQL has the full power of UNIX available during application development and usage. Its user interface is the UNIX shell, so it is not necessary to learn a new set of commands to administer the database. From the point of view of NoSQL, the database is no more than a set of files similar to any other user file. No scripting or other database language is used besides the UNIX shell. This shell-based nature encourages casual use of the database, which makes it familiar and in turn leads to formal use. In other words, NoSQL is a set of shell routines that access normal files of the operating system.[4]
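Since each table is an ordinary file, routine administration needs nothing beyond familiar shell commands. A minimal sketch, assuming a tab-separated table with a header line; the directory name, file names, and data are hypothetical:

```shell
#!/bin/sh
# Hypothetical "database": a temporary directory holding one plain-text table.
db=$(mktemp -d)
printf 'name\tdept\nalice\tops\n' > "$db/emp"

# Insert a row by appending a line to the file.
printf 'bob\tdev\n' >> "$db/emp"

# Back up the table with a plain file copy.
cp "$db/emp" "$db/emp.bak"

# Count data rows: every line except the header.
# tr strips the padding that some wc implementations add.
rows=$(tail -n +2 "$db/emp" | wc -l | tr -d ' ')
echo "$rows"

# Compare table and backup with cmp, as with any other pair of files.
cmp -s "$db/emp" "$db/emp.bak" && same=yes || same=no

rm -rf "$db"
```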
Examples
Databases in the later, general NoSQL sense are often document oriented. To retrieve all information on a particular employee, such a database typically provides a language such as JavaScript or XQuery with functions to retrieve and manipulate documents, rather than an SQL join across several tables such as:
select e.*, a.*, mgr.* from EMPLOYEES e, ADDRESSES a, MANAGERS mgr WHERE <join clauses> ...
A document-oriented NoSQL database often retrieves a pre-connected document representing the entire employee:
let $e := doc("/employee/emp_1234") return $e/address/zip
Stream-based databases also fall into the NoSQL category.
The stream-operator paradigm differs from conventional SQL, but since both are relational, it is possible to map operators to their SQL equivalents:
SQL | NoSQL or /rdb |
---|---|
select col1, col2 from filename | column col1 col2 < filename |
where column = expression | row 'column == expression' |
compute column = expression | compute 'column = expression' |
group by | subtotal |
having | row |
order by column | sorttable column |
unique | uniq |
count | wc -l |
outer join | jointable -al |
update | delete, replace |
nesting | pipes |
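To give a feel for how a relational operation in the table can be expressed on text files, the sketch below performs a left outer join. It uses the standard POSIX `join` utility as a stand-in, not NoSQL's `jointable`, whose options are not reproduced here. The two tables, their headerless tab-separated layout, and the `-` placeholder for missing values are assumptions made for the example.

```shell
#!/bin/sh
# Two hypothetical tab-separated tables without header lines,
# both already sorted on the key in field 1 (join requires sorted input).
dir=$(mktemp -d)
printf 'alice\tops\nbob\tdev\ncarol\tops\n' > "$dir/emp"
printf 'alice\t555-0100\ncarol\t555-0102\n' > "$dir/phone"

# A literal tab character to use as the field separator.
tab=$(printf '\t')

# -a 1 also prints lines of the first file that have no match in the second,
# which yields a left outer join; -o fixes the output fields and
# -e fills any missing field with "-".
joined=$(join -t "$tab" -a 1 -e '-' -o 1.1,1.2,2.2 "$dir/emp" "$dir/phone")
printf '%s\n' "$joined"

rm -rf "$dir"
```

The row for `bob` has no entry in the second table, so it appears in the result with `-` in the last field instead of being dropped.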
References
- "NoSQL: a non-SQL RDBMS". http://www.strozzi.it/. Retrieved 2011-04-05.
- "NoSQL RDBMS". twit88.com. http://twit88.com/. Retrieved 2011-04-06.
  It uses the "Operator-Stream Paradigm" described in "Unix Review", March, 1991, page 24, entitled "A 4GL Language". There are a number of "operators" that each perform a unique function on the data. The "stream" is supplied by the UNIX Input/Output redirection mechanism. Therefore each operator processes some data and then passes it along to the next operator via the UNIX pipe function. This is very efficient as UNIX pipes are implemented in memory. NoSQL is compliant with the "Relational Model". The key feature of NoSQL (and other similar packages mentioned in this manual), is its close integration with UNIX. Real-world problems are typically more complex than the data models provided by many DBMS. Actual applications, and Web-based ones are no exception, are complex puzzles made up of many small pieces, several of which are data-related. Unlike other fourth generation systems, NoSQL is an extension of the UNIX environment, making available the full power of UNIX during application development and usage.
- Roll, John. "Starbase: A User Centered Database for Astronomy" (PDF). Retrieved 2011-05-03.
- "NoSQL: a non-SQL RDBMS: Why NoSQL, in the first place?". http://www.strozzi.it/. Retrieved 2011-04-05.
  A good question one could ask is "With all the relational database management systems available today, why do we need another one ?". The main reasons are:
  - Several times I have found myself writing applications that needed to rely upon simple database management tasks. Most commercial database products are often too costly and too feature-packed to encourage casual use. There is also plenty of good free databases around, but they too tend to provide far more than I need most of the times, and they too lack the shell-level approach of NoSQL. Admittedly, having been written mostly with interpretive languages (Shell, Perl, AWK), NoSQL is not the fastest DBMS of all, at least not always (a lot depends on the application).
  - NoSQL is easy to use by non-computer people. The concept is straight forward and logical. To select rows of data, the 'row' operator is used; to select columns of data, the 'column' operator is used.
  - The data is highly portable to and from other types of machines, like Macintoshes or DOS computers.
  - The system should run on any UNIX machine (that has the PERL and the AWK Programming Languages installed). Some users have reported it to work also on the Cygwin UNIX-like environment for Microsoft Windows.
  - NoSQL essentially has no arbitrary limits and, at least in principle, it can work where other products can't. For example there is no limit on data field size, the number of columns, or file size (the number of columns in a table may actually be limited to 32.768 by some implementations of the AWK programming language).
Further reading
- Ayers, Larry (November 1998). "How Not To Re-Invent The Wheel". Linux Gazette. Retrieved 2011-05-03.
- Litt, Steve (April 2007). "NoSQL: The Unix Database (With awk)". Linux Productivity Magazine. Retrieved 2011-05-03.
- Paterno, Giuseppe (November 1, 1999). "NoSQL Tutorial". Linux Journal. Retrieved 2011-05-03.