Relational model

In database theory, the relational model refers to the type and design of a database system. Other kinds of models are Hierarchical and Network databases which predate the relational model and are still in use today in data centers with high data volume needs or where existing systems are so complex it would be cost prohibitive to migrate to systems employing the relational model. Invented by E. F. Codd, it is particularly suited to dynamic, adhoc retrievels of data and the adhoc addition/subtraction of entities to the database using a specialized, descriptive language called Structured Query Language (or SQL). Usually it is adopted because of its inherent flexibility with regard to future database design and growth and also because of its relative low cost due to market forces.

The underlying rules of the relational model are founded upon a branch of math, developed especially for the model, known as relational calculus. The details of the mathematical logic of the calculus are handled by the database engine; however, in order to avoid anomalies during utilization of the database, specific design constraints of the tables and their fields must be met. (See database normalization)

In the relation model, an entity or table consists of one or more fields (formally known as attributes). Each field is assigned a domain (often called a data type) which describes the values that the field can take on (e.g. Integers, Dates, Strings of varying length, etc.). One of the characteristic traits of the relational model is that each table in the database is linked or related to another table through primary and/or foreign keys.

Example Database

An idealized, very simple example of a description of some tables and their attributes:

Customer(Customer ID, Tax ID, Name, Address, City, State, Zip, Phone)

Order(Order No,Customer ID, Invoice No,Date Placed, Date Promised, Terms, Status)

Order Line(Order No,Order Line No,Product Code, Qty)

Invoice(Invoice No,Customer ID, Order No, Date, Status)

Invoice Line(Invoice No,Line No,Product Code, Qty Shipped)

In this design we have 5 tables (Customer, Order, Order Line, Invoice, and Invoice Line). The bold, underlined field names are primary keys. The non-bold, underlined fields are called foreign keys.

A primary key is a unique identifier which implies that no other row in the table will be duplicated since a primary key is unique. (A row is referred to as a tuple in theoretical settings). Below is a physical depiction of the contents of our example Customer table.

Example: Customer Table


Customer ID     Tax ID              Name                 Adresss                 [More fields....]

==================================================================================================

1234567890      555-5512222         Jo Lee               323 Broadway            ...

2223344556      555-5523232         Dorothy Red          1200 Main Street        ...

3334445563      555-5533322         Linda de la Cruz     871 1st Street          ...

4232342432      555-5325523         E. F. Codd           123 It Way              ...

If we attempted to insert a new customer with the ID 1234567890, this would violate the design of the table since Customer ID is a primary key guaranteed to identify one and only one row since we already have a customer, 1234567890. Every database handles this condition in their own way. However, to enforce this design constraint is to maintain referential integrity.

Foreign keys indicate that the field name is linked to (or relates to) a primary key in another table as in the Customer ID in the Order table (table and entity are used interchangeably). It is this "relating" that is the underpinning idea of the relational model. By relating each table shown above we could query the database for all of the Customers, Orders, and invoices. If we only wanted the rows for a specific customer, we would specify this using a predicate or condition using the where clause(see SQL).

If we wanted to retrieve all of the Orders for Customer 1234567890, we could query the database to return every row in the Order table with Customer ID 1234567890 and join the Order table to the Order Line table based on Order No.

There is a flaw in our table design shown above. What if we wanted to retrieve every Invoice for every Order for a particular Customer ID? Oftentimes an Invoice can be created against 0 or many Order(s). And an order can be paid through 0 or many orders. This is a many-to-many relationship between Order and Invoice. When first designing a database it is important to eliminate many-to-many relationships by normalizing the database design. (Later, when optimizing for performance and usage patterns--especially in data warehouses--designers will denormalize a database).

We want to eliminate this many-to-many relationship (also called a non-specific relationship). This is a common occurrance in databases. To remedy this we create a new table often called an intersection table:

OrderInvoice(Order No,Invoice No)

Now, the Order table has a one-to-many relationship to the OrderInvoice table and so does the Customer table. If we want to retrieve every Invoice for every Order for a particular customer, we can query the database for all orders in the Order table where Order No equals the Order No in the OrderInvoice table and where Invoice No in the OrderInvoice table equals the Invoice No in the Invoice table.

References

Communications of the ACM, Vol. 13, No. 6, June 1970, pp. 377-387.

http://www.acm.org/classics/nov95/toc.html