Talk:Comparison of relational database management systems
Useless columns?
The price column is kind of useless because 1) DB server prices vary widely based on edition, number of users, etc., and are too complicated to go in a table cell by themselves and difficult to compare with other servers due to not being "apples-to-apples," 2) most of the "prices" are question marks, 3) even if they were filled in they would be obsolete quickly, and 4) the header says USD but the only actual price listed is in Euros.
ISTM the point of the author was to distinguish free vs. non-free (beer) DBs, but the licensing column already does that.
Also, I have NPOV concerns about "Proprietary" due to the content of the linked article. I'm not sure what a better link would be, however.
- I agree, let's remove it. And we should consider moving the "Index" and "Table" columns of the "Database object" table, since all RDBMSes support them. TroelsArvin 10:15, 8 Mar 2005 (UTC)
- Regarding indexes, there could be a single column listing which index types are supported, I guess, but even that suggests that more types are better, which isn't (IMHO) necessarily the case, especially when viewed from a usability point of view. I'd be OK with dropping the columns. -Craig Stuntz
Another concern I have is with "schema." This term is vague (it can mean a specific server feature or just a database's metadata), which makes the column useless. -Craig Stuntz
- Schema refers to the ability to create database schema, i.e. CREATE SCHEMA / CREATE DATABASE. While most RDBMS supports this, some don't. Anyway, probably not that useful, may drop it as well. --Minghong 09:28, 9 Mar 2005 (UTC)
- I regret that the schema column was removed. I've seen rather ugly DB designs which would have been nicer if schemas were used. TroelsArvin 11:39, 9 Mar 2005 (UTC)
Nobody seems to oppose removing the price column. I'll remove it shortly, if noone steps up and complains. TroelsArvin 19:15, 9 Mar 2005 (UTC)
Revert index
What's a "revert index"? TroelsArvin 13:02, 9 Mar 2005 (UTC)
- Revert index is available in Oracle. "Creating a reverse key index reverses the bytes of each column key value, keeping the column order in case of a composite key". For example, the value of the primary key ranges from 7000 to 8000. If a regular index is used, the B-tree (or R-tree) will be unbalanced. If they are reverted, i.e. 0008, 9997, .... 0007, the tree will be more balanced. --Minghong 15:29, 9 Mar 2005 (UTC)
- Ah, a reverse index. I've changed the page. TroelsArvin 18:31, 9 Mar 2005 (UTC)
- Oops, sorry for the typo. --Minghong 11:01, 10 Mar 2005 (UTC)
Question
Nice work! Do you have any plans to include information on scalability, replication, clustering and back-ups? For an example of a DB comparison structure see also: http://det-dbalice.if.pw.edu.pl/ttraczyk/db_compare/db_compare.html (if the link doesn't work see Google Cache).
- Need to do some research first. :-P --Minghong 09:28, 9 Mar 2005 (UTC)
- Some of these concepts are rather ambiguous (e.g., replication isn't just "replication", but includes variants such as synchronous/asynchronous), and some of the features are available only in an "enterprise"-edition of a DBMS, or as add-ons. I don't think that this page should be filled with debatable features. TroelsArvin 11:39, 9 Mar 2005 (UTC)
What about MaxDB? (see i.e.: http://www.torsten-horn.de/techdocs/sql.htm#Vergleich-MySQL-PostgreSQL-MaxDB https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/com.sap.km.cm.docs/documents/a1-8-4/MaxDB%20-%20An%20Enterprise%20Open%20Source%20Database.article)
- Nobody would object to that, but someone has to do the work. TroelsArvin 11:39, 9 Mar 2005 (UTC)
OS support
The OS support matrix plots support for un-emulated OS support. Sybase explicitly mentions that it needs to be run with the Linux ABI FreeBSD add-on. In some definitions of "emulation", that would exclude Sybase on FreeBSD. Parts of FreeBSD's documentation [use the word "emulator"] for the Linux ABI support. TroelsArvin 11:39, 9 Mar 2005 (UTC)
- Does that mean FreeBSD support should be "No" for Sybase and Oracle? --Minghong 15:37, 9 Mar 2005 (UTC)
- I have already set BSD support to "no" for Oracle, because Oracle doesn't mention BSD as a supported platform. For Sybase, I'm less sure: Sybase runs on FreeBSD through the Linux ABI compatibility libraries, so one might call it emulation. However, as Sybase officially mentions FreeBSD as supported, I'm inclined to let it stay as "yes". TroelsArvin 18:31, 9 Mar 2005 (UTC)
Index organized tables
I noticed that index organized tables (IOTs) were introduced, and then removed again. The reason for removal was that only Oracle supports them. This is wrong: "Clustering indexes" are a commonly known DBMS concept, and PostgreSQL (and probably others) supports them via the CLUSTER command. In PostgreSQL, however, row changes (including insertions/deletions) don't reorganize the table automatically (potentially a very I/O-intensive operation), so the CLUSTER-command needs to be re-executed when relevant. I also think that InnoDB tables in MySQL are implicitly index-organized (around the primary key, if nothing else is specified). TroelsArvin 12:46, 9 Mar 2005 (UTC)
- Are they the same thing? Even so, the name is different, making it hard to be compared. --Minghong 15:37, 9 Mar 2005 (UTC)
- Vendors use lots of different terms for the same concepts; one has to get used to that. However, I don't think that clustering index support is important. TroelsArvin 18:31, 9 Mar 2005 (UTC)
Features in beta versions
I think it's disturbing that some features are mentioned as being available, but only in experimental versions. We are basically talking about the future (not facts) when planned features are mentioned. DBMS vendors are constantly working on new features, so the page could turn into a jungle of footnotes. Also: Planned features don't always turn into implemented features, as initially expected.
If the page keeps notes about version differences, it would be much more useful to introduce notes about when a certain feature was added (i.e. notes about the past). TroelsArvin 12:58, 9 Mar 2005 (UTC)
- The table will be based on the fact in the latest stable version. But footnote is free to list any version (so as to tell others than "it will have that feature really soon". --Minghong 15:37, 9 Mar 2005 (UTC)
- Anything can happen really soon. TroelsArvin 18:31, 9 Mar 2005 (UTC)
Bitmap indexes
The DB2 support for bitmapped indexes is questionable, see [this usenet discussion]. TroelsArvin 13:06, 9 Mar 2005 (UTC)
B-tree or R-tree?
While B-tree (or B+-tree) is the simplest kind of index, I think most non-toy RDBMS would use R-tree instead. That's why I haven't specify whether it is B+-tree or R-tree. --Minghong 15:20, 9 Mar 2005 (UTC)
- R-tree indexes are used for geometric data. It doesn't make sense to call it a "basic" index type. Actually, it would be relevant to add a column marking R-tree support in the different DBMSes. TroelsArvin 18:31, 9 Mar 2005 (UTC)
- I've added a column for support of indexing of geometrical data. TroelsArvin 09:26, 10 Mar 2005 (UTC)
- Really? Isn't multi-column index a R-tree (as R-tree, in simple terms, multi-dimensional B-tree)? e.g.
CREATE UNIQUE INDEX emp_pk_idx ON emp( dept_id, staff_id );
I think that's a fundamental data structure, not necessary related to geometric data (although it certainly helps). --Minghong 10:18, 10 Mar 2005 (UTC)
- It's true that an R-tree an be seen as a generalization of B-trees. But that doesn't mean that an index covering multiple columns is automatically an R-tree, as far as I know.TroelsArvin 10:47, 10 Mar 2005 (UTC)
- Alright, it seems that some RDBMSes uses B-tree even when the index involves multiple columns. (I just wonder how... Multiple B-trees? B-tree is single dimension only...) P.S. I've added hash index since it is the most fundamental index. --Minghong 10:55, 10 Mar 2005 (UTC)
- You changed my "Geometric" index type to "R-tree". I disagree with that change. DB2, for example, has an index type for geometric data, but it's not strictly R-tree based (as far as I know). What's interesting is if the DBMS offers (at least one) multidimensional index type.TroelsArvin 11:09, 10 Mar 2005 (UTC)
Geometric is not specific enough, and R-tree is not just for geometric data. Unless there is an index type called "geometric", I don't think it should be used. Perheps a new column called "GIT" (DB2) [1] or "GiST" (PostgreSQL) [2]? --Minghong 11:35, 10 Mar 2005 (UTC)
- I have yet to see DBMS documentation mention R-trees in other contexts than geometry. I really think that the "R-tree" column is uninteresting in itself now, (and I suggest that it be removed). Exactly like I wouldn't care exactly which index type and implementation a DBMS uses for full-text indexing, as long as it supports efficient full-text index based queries. (A relevant column-addition candidate, by the way.)TroelsArvin 12:00, 10 Mar 2005 (UTC)
- I'm aware of full-text indexing. However there is no index type known as "full-text", i.e.
CREATE FULLTEXT INDEX...
orCREATE INDEX ... FULLTEXT ...
? It should be put under "special features" or "other features" instead. --Minghong 12:30, 10 Mar 2005 (UTC)
- I'm aware of full-text indexing. However there is no index type known as "full-text", i.e.
Fundamental RDBMS Features vs. Implementation Details
I have a general concern about listing implementation details such as index types and partitioning in a comparison article. Features like conformity to the relational model, SQL support, etc., are fundamental properties of RDBMSs and should be listed here. Partitioning, OTOH, is a kludge required because some DB servers can't perform acceptably (use whatever definition of "acceptably" you care to, here) without using it in some cases. People don't choose a DB server because it supports partitioning, they choose it because it allows them to quickly and concurrently manage data. If partitioning helps, that's fine, but you can't tell from looking at a chart indicating whether partitioning is supported and know whether it's required to get acceptable performance from that particular server.
In short, I think the most valuable comparison is one which sticks to the issues which affect all servers, such as price, licensing, relational model features, standard interface support (e.g., JDBC, ODBC, ADO.NET, etc.).
- I had thought about those programming interfaces. But since most possible all of them (except ADO.NET), there is no point listing them. --Minghong 10:24, 10 Mar 2005 (UTC)
Stored procedures vs. user defined functions
The page currently has a "User defined functions" column and a "Stored procedure" column. If we want to distinguish between user defined functions and user defined procedures, then the "Stored procedure" column should be renamed. But before I do that: Does anyone know why both columns are there? TroelsArvin 18:41, 9 Mar 2005 (UTC)
- A stored procedures is usually PL/SQL (or similar languages); while a user defined function is not (e.g. linking to host language like PHP, or object files like .obj. --Minghong 10:24, 10 Mar 2005 (UTC)
- In SQL, there are two types of "SQL-invoked routines": Functions and procedures. They are almost identical, except for details of how they are invoked and how data are returned. DBMS suport is a bit different for the two routine types, by the way (PostgreSQL supports functions, not procedures, for example). I think you are talking about "external routines" which are routines (functions or procedures) executed by code written in a non-SQL language. What I find interesting to list is:
- trigger support
- support for SQL-routines (whatever kind)
- what external languages are supported
- support for user defined types
- TroelsArvin 10:58, 10 Mar 2005 (UTC)
- In SQL, there are two types of "SQL-invoked routines": Functions and procedures. They are almost identical, except for details of how they are invoked and how data are returned. DBMS suport is a bit different for the two routine types, by the way (PostgreSQL supports functions, not procedures, for example). I think you are talking about "external routines" which are routines (functions or procedures) executed by code written in a non-SQL language. What I find interesting to list is:
- The terminology I've used matches standard SQL. And the same terminology is used in the Oracle, PostgreSQL and DB2 documentation. TroelsArvin 11:21, 10 Mar 2005 (UTC)
I've tried unifying the "Stored procedures" and "User defined functions" columns. TroelsArvin 12:40, 10 Mar 2005 (UTC)
- Minghong, you have undone my unification, and then introduced two errors, and general confusion:
- Error 1: The page currently says: "User defined function refers to external routines written in the host languages, such as C, Java, Cobol, etc.". This is wrong: A user defined function can be written in either SQL or an external language. If we want to display the difference between functions written in SQL and external binaries/scripts, then the term "external" needs to be emphasized, as "user defined function" is too vague.
- Error 2: The page now states that PostgreSQL supports procedures - which is wrong (PostgreSQL has no 'CALL ...' construct). My previous edit removed that error.
- Worse: The page now has columns for both functions and procedures (in addition to "User defined function"). Why have both, when the difference really is minimal (see my previous comment). And, if both are retained: At least, you need to specify the difference, if you introduce the separation of concepts.
- TroelsArvin 14:26, 10 Mar 2005 (UTC)
- Sorry for undoing your change. But unification makes it ambiguous. Procedure and Function are two things as there are 2 statements:
CREATE PROCEDURE
andCREATE FUNCTION
(there are more, like packages, rountine, etc, but these are not common). If the differences are really that minimal, maybe combine the 2 columns. For the team UDF, checkout the SQLite, Firebird, MySQL, and PostgreSQL manuals. They consistently use the term "UDF" for external functions (well, DB2 seems not to follow this convention). OK, my fault, PostgreSQL does not support procedures (that's why 2 columns is better than 1 column: we can show that difference). --Minghong 00:24, 4 Jan 2003 (UTC)
- Sorry for undoing your change. But unification makes it ambiguous. Procedure and Function are two things as there are 2 statements: