Occam learning

From Wikipedia, the free encyclopedia

In Occam learning, named after Occam's razor, a PAC learning algorithm is evaluated by the succinctness of the hypothesis it outputs and by its performance on the training set, rather than directly by its predictive power on a test set. Occam learnability is equivalent to PAC learnability.

Definitions of Occam learning and succinctness

The succinctness of a concept c in a concept class C can be expressed by the length size(c) of the shortest bit string that can represent c in C. Occam learning connects the succinctness of a learning algorithm's output hypothesis to its predictive power on unseen data.

Let C and H be concept classes containing target concepts and hypotheses respectively, and let the sample set S contain m samples, each of n bits. Then, for constants α ≥ 0 and 0 ≤ β < 1, a learning algorithm L is an (α,β)-Occam algorithm for C using H if, given S labeled according to a concept c ∈ C, it outputs a hypothesis h ∈ H such that h is consistent with c on S (that is, h(x) = c(x) for every x in S) and size(h) ≤ (n · size(c))^α · m^β.[1][2]

Such an algorithm is called an efficient (α,β)-Occam algorithm if it runs in time polynomial in n, m, and size(c).
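
As an illustration of this definition, the following Python sketch (not from the cited references; the function name, the explicit size parameters, and the representation of hypotheses as callables are assumptions made here for concreteness) checks whether a candidate hypothesis meets the two requirements placed on the output of an (α,β)-Occam algorithm:

    def is_occam_output(hypothesis, target, samples, size_h, size_c, n, alpha, beta):
        """Check the two (α,β)-Occam conditions for a candidate hypothesis.

        hypothesis, target -- callables mapping an example to its label
        samples            -- the m training examples, each an n-bit vector
        size_h, size_c     -- bit lengths of the hypothesis and the target concept
        """
        m = len(samples)
        # Condition 1: h is consistent with c on the sample S.
        consistent = all(hypothesis(x) == target(x) for x in samples)
        # Condition 2: size(h) <= (n * size(c))^alpha * m^beta.
        succinct = size_h <= (n * size_c) ** alpha * m ** beta
        return consistent and succinct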

Equivalence of Occam and PAC learning

Any efficient Occam algorithm is also an efficient PAC learning algorithm. Specifically, let L be an efficient (α,β)-Occam algorithm for C using H and let D be any distribution over the instance space. Then there exists a constant a > 0 such that, for any 0 < ε, δ < 1, given m ≥ a · ((1/ε) · log(1/δ) + ((n · size(c))^α / ε)^(1/(1−β))) samples of n bits each, drawn from D and labeled according to a concept c ∈ C, L outputs a hypothesis h ∈ H whose error with respect to D is at most ε, with probability at least 1 − δ.[1]
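
The bound can be evaluated numerically with a short sketch. In the following Python fragment, the function name is illustrative and the constant a is left as a parameter, since the theorem only asserts that some such constant exists:

    import math

    def occam_sample_bound(eps, delta, n, size_c, alpha, beta, a=1.0):
        """Samples sufficient for an efficient (α,β)-Occam algorithm to output,
        with probability at least 1 - delta, a hypothesis with error at most eps.
        The constant a is not pinned down by the theorem; 1.0 is a placeholder.
        """
        bound = a * ((1.0 / eps) * math.log(1.0 / delta)
                     + ((n * size_c) ** alpha / eps) ** (1.0 / (1.0 - beta)))
        return math.ceil(bound)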

Conversely, for concept classes that are polynomially closed under exception lists, any PAC learning algorithm can be converted into an Occam algorithm, so PAC learnability implies Occam learnability as well.[3]

Improving sample complexity for common problems

Though Occam and PAC learnability are equivalent, the Occam framework can be used to produce tighter bounds on the sample complexity of classical problems including conjunctions,[1] conjunctions with few relevant variables,[4] and decision lists.[5]
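
For instance, the classic elimination algorithm for conjunctions of Boolean literals is a simple Occam algorithm: its output is consistent with the sample, and its length depends only on n, not on the number of samples m. A minimal Python sketch, assuming examples are 0/1 tuples labeled by an unknown target conjunction:

    def learn_conjunction(samples):
        """Elimination algorithm for conjunctions of Boolean literals.

        samples -- list of (x, label) pairs, where x is a tuple of 0/1 values
                   and label is 1 iff x satisfies the (unknown) target conjunction.
        Returns the surviving literals as (index, polarity) pairs.
        """
        n = len(samples[0][0])
        # Start with all 2n literals: x_i (polarity True) and not-x_i (polarity False).
        literals = {(i, pol) for i in range(n) for pol in (True, False)}
        for x, label in samples:
            if label == 1:
                # Delete every literal falsified by a positive example.
                literals = {(i, pol) for (i, pol) in literals if bool(x[i]) == pol}
        return literals

Because the sample is labeled by a conjunction, the conjunction of the surviving literals is consistent with all of the given examples, and it contains at most 2n literals, so its description length is polynomial in n and independent of m; this is the succinctness condition with β = 0.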

Extensions

Occam algorithms have also been shown to be successful for PAC learning in the presence of errors,[6][7] probabilistic concepts,[8] function learning,[9] and Markovian non-independent examples.[10]

References

  1. ^ a b c Kearns, M. J., & Vazirani, U. V. (1994). An introduction to computational learning theory, chapter 2. MIT press.
  2. ^ Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. K. (1987). Occam's razor. Information processing letters, 24(6), 377-380.
  3. ^ Board, R., & Pitt, L. (1990, April). On the necessity of Occam algorithms. In Proceedings of the twenty-second annual ACM symposium on Theory of computing (pp. 54-63). ACM.
  4. ^ Haussler, D. (1988). Quantifying inductive bias: AI learning algorithms and Valiant's learning framework. Artificial intelligence, 36(2), 177-221.
  5. ^ Rivest, R. L. (1987). Learning decision lists. Machine learning, 2(3), 229-246.
  6. ^ Angluin, D., & Laird, P. (1988). Learning from noisy examples. Machine Learning, 2(4), 343-370.
  7. ^ Kearns, M., & Li, M. (1993). Learning in the presence of malicious errors. SIAM Journal on Computing, 22(4), 807-837.
  8. ^ Kearns, M. J., & Schapire, R. E. (1990, October). Efficient distribution-free learning of probabilistic concepts. In Proceedings of the 31st Annual Symposium on Foundations of Computer Science (pp. 382-391). IEEE.
  9. ^ Natarajan, B. K. (1993, August). Occam's razor for functions. In Proceedings of the sixth annual conference on Computational learning theory (pp. 370-376). ACM.
  10. ^ Aldous, D., & Vazirani, U. (1990, October). A Markovian extension of Valiant's learning model. In Proceedings of the 31st Annual Symposium on Foundations of Computer Science (pp. 392-396). IEEE.