The growth function (also called: shatter coefficient) measures the richness of a set family. It is especially used in the context of statistical learning theory, where it measures the complexity of a hypothesis class.
The term 'growth function' was coined by Vapnik and Chervonenkis in their 1968 paper, where they also proved many of its properties.[1]
Definitions
Set-family definition
Let be a set family (a set of sets) and a set. Their intersection is defined as the following set-family:
The index of with respect to is the size of this intersection:
Obviously, if a set has elements then the index is at most .
The growth function measures the size of as a function of . Formally:
Hypothesis-class definition
Equivalently, let be a hypothesis-class (a set of binary functions) and a set with elements. The restriction of to is the set of binary functions on that can be derived from :[2]: 45
The growth function measures the size of as a function of :[2]: 49
Examples
1. The domain is the real line .
The set-family contains all the half-lines (rays) from a given number to positive infinity, i.e., all sets of the form for some .
For any set of real numbers, the intersection contains sets: the empty set, the set containing the largest element of , the set containing the two largest elements of , and so on. Therefore: .[1]: Ex.1 The same is true whether contains open half-lines, closed half-lines, or both.
2. The domain is the segment .
The set-family contains all the open sets.
For any set of real numbers, the intersection contains all possible subsets of . There are such subsets, so .
[1]: Ex.2
4. The domain is the real line . The set-family contains all the real intervals, i.e., all sets of the form for some . For any set of real numbers, the intersection contains all runs of between 0 and consecutive elements of . The number of such runs is , so .
Properties
The growth function has two trivial bounds.
1. For any finite :
since for every , the number of elements in is at most . Therefore, the growth function is mainly interesting when is infinite.
2. For any nonempty :
I.e, the growth function has an exponential upper-bound.
We say that a set-family shatters a set if their intersection contains all possible subsets of , i.e. .
If shatters of size , then , which is the upper bound.
3. The following is a property of the Index function:
If, for some set of size , and for some number , -
then, there exists a subset of size such that = .
This implies the following property of the Growth function.[1]: Th.1
For every family there are two cases:
The exponential case: identically.
The polynomial case: is majorized by , where is the smallest integer for which .
The VC dimension of is defined according to these two cases:
In the polynomial case, = the largest integer for which .
In the exponential case.
So if-and-only-if .
The growth function can be regarded as a refinement of the concept of VC dimension. The VC dimension only tells us whether is equal to or smaller than , while the growth function tells us exactly how changes as a function of .
4. Another connection between the growth function and the VC dimension is given by the Sauer–Shelah lemma:[2]: 49
If , then:
for all :
In particular,
for all :
so the growth function grows polynomially, rather then exponentially, with .
Applications in probability theory
Let be family of subsets of some universal set . Suppose we choose a set that contains elements of . For each element we calculate the relative frequency and compare it to the probability of . Then, the difference satisfies the following upper bound:[1]: Th.2
References
^ abcdef
Vapnik, V. N.; Chervonenkis, A. Ya. (1971). "On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities". Theory of Probability & Its Applications. 16 (2): 264. doi:10.1137/1116025.
The paper was first published in 1968 in Russian.
The first English translation, by B. Seckler, appeared in 1971.
The translation was reproduced in 2015:
Vapnik, V. N.; Chervonenkis, A. Ya. (2015). "On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities". Measures of Complexity. p. 11. doi:10.1007/978-3-319-21852-6_3. ISBN978-3-319-21851-9.
^ abcShalev-Shwartz, Shai; Ben-David, Shai (2014). Understanding Machine Learning – from Theory to Algorithms. Cambridge University Press. ISBN9781107057135.