Chase (algorithm)

The Chase is a simple fixpoint algorithm for testing and enforcing the implication of data dependencies in database systems. It plays an important role in database theory as well as in practice: it is used, directly or indirectly, on an everyday basis by people who design databases, and in commercial systems to reason about the consistency and correctness of a data design. New applications of the chase in metadata management and data exchange are still being discovered.

The Chase has its origins in two seminal papers, one by David Maier, Alberto O. Mendelzon, and Yehoshua Sagiv^[1] and the other by Alfred V. Aho, Catriel Beeri, and Jeffrey D. Ullman^[2].

The chase test checks whether a relation can be recovered by rejoining its projections onto a given decomposition. Let R be a relation that satisfies a set F of functional dependencies (FDs), and let t be a tuple in $\pi _{S_{1}}(R)\bowtie \pi _{S_{2}}(R)\bowtie ...\bowtie \pi _{S_{k}}(R)$. Then there are tuples t₁, ..., t_k in R such that t is the join of the projections of each t_i onto S_i, for i = 1, 2, ..., k. Each t_i therefore agrees with t on the attributes in S_i; its values on the attributes outside S_i are unknown.
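To make the question concrete, the following minimal Python sketch projects a small relation onto two attribute sets and rejoins the projections. The helper names (project, natural_join, as_set) and the sample relation are illustrative assumptions and do not come from the cited papers.

```python
from itertools import product

def project(rows, attrs):
    """Project a relation (list of dicts) onto attrs, removing duplicates."""
    seen = {tuple((a, row[a]) for a in attrs) for row in rows}
    return [dict(items) for items in seen]

def natural_join(left, right):
    """Natural join of two relations given as lists of dicts."""
    out = []
    for l, r in product(left, right):
        shared = l.keys() & r.keys()
        # Two tuples combine only if they agree on every shared attribute.
        if all(l[a] == r[a] for a in shared):
            out.append({**l, **r})
    return out

def as_set(rows, attrs):
    """Turn a list of dicts into a set of value tuples for easy comparison."""
    return {tuple(row[a] for a in attrs) for row in rows}

# Hypothetical instance of R(A, B, C); no FD relates B to A or C here.
R = [{"A": 1, "B": 2, "C": 3}, {"A": 4, "B": 2, "C": 5}]
rejoined = natural_join(project(R, ["A", "B"]), project(R, ["B", "C"]))

original = as_set(R, ["A", "B", "C"])
result = as_set(rejoined, ["A", "B", "C"])
extra = result - original  # tuples in the join that are not in R
```

In this instance the join contains every original tuple plus two spurious ones, (1, 2, 5) and (4, 2, 3), so R is not recovered. The chase test decides from F alone whether such spurious tuples can ever appear.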

The chase test can be carried out by drawing a tableau. Suppose R has attributes A, B, ... and the components of t are a, b, .... In the row for t_i, use the same letter as t in each component whose attribute is in S_i, and subscript the letter with i in each component whose attribute is not in S_i. Then t_i agrees with t on the attributes in S_i and has a unique symbol everywhere else.
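The construction rule can be sketched in a few lines of Python. The function name initial_tableau and the string encoding of symbols (a plain letter such as "a" for an unsubscripted symbol, "b1" for b subscripted with 1) are assumptions made for this illustration only, and the sample decomposition is a hypothetical one over four attributes.

```python
def initial_tableau(attributes, decomposition):
    """Build one tableau row per attribute set S_i of the decomposition.

    attributes:    sequence of single-letter attribute names, e.g. "ABCD"
    decomposition: list of sets of attribute names, in the order S_1, S_2, ...
    """
    tableau = []
    for i, schema in enumerate(decomposition, start=1):
        row = [
            a.lower() if a in schema else f"{a.lower()}{i}"  # subscript with i outside S_i
            for a in attributes
        ]
        tableau.append(row)
    return tableau

# Hypothetical decomposition of R(A, B, C, D) into three attribute sets.
tableau = initial_tableau("ABCD", [{"A", "D"}, {"A", "C"}, {"B", "C", "D"}])
for row in tableau:
    print("\t".join(row))
```

Each row has unsubscripted letters exactly in the columns of its own attribute set, so no two rows share a subscripted symbol at the start.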

Example

Suppose a relation R(A, B, C, D) is decomposed into relations with attribute sets S₁ = {A, D}, S₂ = {A, C} and S₃ = {B, C, D}, and that the FDs F = {A→B, B→C, CD→A} hold in R. The initial tableau for this decomposition is:

A	B	C	D
a	b₁	c₁	d
a	b₂	c	d₂
a₃	b	c	d

The first row represents S₁: the components for attributes A and D are unsubscripted, and those for attributes B and C are subscripted with i = 1. The second and third rows are filled in the same manner for S₂ and S₃ respectively.

The goal of the test is to use the given F to prove that t = (a, b, c, d) is really in R. To do so, the tableau is chased by applying the FDs in F to equate symbols in the tableau. If the final tableau contains a row that is the same as t, then any tuple t in the join of the projections is actually a tuple of R.

To perform the chase test, first decompose the FDs in F so that each FD has a single attribute on its right-hand side. Left-hand sides cannot be split, so CD→A stays as it is; here every FD already has the required form and F = {A→B, B→C, CD→A} is unchanged. When equating two symbols, if one of them is unsubscripted, make the other the same, so that the final tableau can have a row that is exactly t = (a, b, c, d). If both are subscripted, change either one to the other. In both cases, change all occurrences of the replaced symbol throughout the tableau.

First, apply A→B to the tableau. The first and second rows agree on A (both have a), so their B components must be equal: change b₂ to b₁. The third row has a₃ in A, so its b stays the same. The resulting tableau is:

A	B	C	D
a	b₁	c₁	d
a	b₁	c	d₂
a₃	b	c	d

Then consider B→C. The first and second rows both have b₁ in B, and the second row has an unsubscripted c. Therefore c₁ in the first row changes to c, and the first row becomes (a, b₁, c, d). The resulting tableau is:

A	B	C	D
a	b₁	c	d
a	b₁	c	d₂
a₃	b	c	d

Now consider CD→A. The first and third rows agree on both C and D (c and d), and the first row has an unsubscripted a. Hence, change a₃ in the third row to a. The resulting tableau is:

A	B	C	D
a	b₁	c	d
a	b₁	c	d₂
a	b	c	d

At this point the third row is (a, b, c, d), which is the same as t, so this is the final tableau for the chase test with the given R and F. Hence, whenever R is projected onto S₁, S₂ and S₃ and the projections are rejoined, every tuple of the result is in R. In particular, the resulting tuple is the tuple of R that was projected onto {B, C, D}.
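The whole test can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the formulation used in the cited papers: the function name chase_lossless is hypothetical, symbols are encoded as (letter, subscript) pairs with subscript 0 meaning "unsubscripted", each FD is given as a pair (left-hand side, single right-hand attribute), and the loop runs until no FD changes the tableau rather than stopping at the first row equal to t.

```python
def chase_lossless(attributes, decomposition, fds):
    """Return True if the chase produces a row of unsubscripted symbols.

    attributes:    sequence of single-letter attribute names, e.g. "ABCD"
    decomposition: list of attribute sets S_1, ..., S_k
    fds:           list of (lhs, rhs) pairs; lhs is an iterable of attributes,
                   rhs is a single attribute
    """
    index = {a: k for k, a in enumerate(attributes)}
    # Row i uses (letter, 0) inside S_i and (letter, i) outside it.
    tableau = [
        [(a.lower(), 0 if a in schema else i) for a in attributes]
        for i, schema in enumerate(decomposition, start=1)
    ]
    target = [(a.lower(), 0) for a in attributes]

    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            cols = [index[a] for a in lhs]
            col = index[rhs]
            for r1 in tableau:
                for r2 in tableau:
                    agree = all(r1[c] == r2[c] for c in cols)
                    if agree and r1[col] != r2[col]:
                        # Prefer the unsubscripted symbol (subscript 0).
                        keep, drop = sorted((r1[col], r2[col]), key=lambda s: s[1])
                        # A symbol only ever occurs in its own column, so
                        # replacing within this column replaces all occurrences.
                        for row in tableau:
                            if row[col] == drop:
                                row[col] = keep
                        changed = True
    return target in tableau

example = chase_lossless(
    "ABCD",
    [{"A", "D"}, {"A", "C"}, {"B", "C", "D"}],
    [("A", "B"), ("B", "C"), ("CD", "A")],
)
```

Every replacement reduces the number of distinct symbols in the tableau, so the loop terminates. For the decomposition and FDs of the example, the sketch reports that a fully unsubscripted row is reached, in agreement with the tableau above.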

References

[1] David Maier, Alberto O. Mendelzon, and Yehoshua Sagiv: "Testing Implications of Data Dependencies". ACM Trans. Datab. Syst. 4(4):455-469, 1979.
[2] Alfred V. Aho, Catriel Beeri, and Jeffrey D. Ullman: "The Theory of Joins in Relational Databases". ACM Trans. Datab. Syst. 4(3):297-314, 1979.

Serge Abiteboul, Richard B. Hull, Victor Vianu: Foundations of Databases. Addison-Wesley, 1995.
J. D. Ullman: Principles of Database and Knowledge-Base Systems, Volume I. Computer Science Press, New York, 1988.
J. D. Ullman; J. Widom: A First Course in Database Systems (3rd ed.). pp.96-99. Pearson Prentice Hall, 2008.
