
Learning curve (machine learning)

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Justin Ormont (talk | contribs) at 09:39, 15 February 2019 (Created as stub w/ content from Learning_curve#In_machine_learning and https://scikit-learn.org/stable/modules/learning_curve.html#learning-curve (BSD licensed)). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


Learning curve showing training score and cross-validation score

A learning curve shows the validation and training score of an estimator for varying numbers of training samples. It is a tool to find out how much a machine learning model benefits from adding more training data and whether the estimator suffers more from a variance error or a bias error. If both the validation score and the training score converge to a value that is too low with increasing size of the training set, the estimator will not benefit much from more training data.[1]
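The procedure described above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch under stated assumptions, not the scikit-learn implementation (scikit-learn provides `learning_curve` in `sklearn.model_selection` for this purpose). The helper names `fit_line`, `mse` and `learning_curve_points`, the synthetic quadratic data, and the single hold-out validation split are all assumptions made for the example. It reports mean squared error rather than a score, so lower is better, and a bias-limited model shows both curves levelling off at a value that is too high rather than too low.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def mse(model, xs, ys):
    """Mean squared error of the line `model` on the given points."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def learning_curve_points(train, valid, sizes):
    """Return (size, training error, validation error) for each training-set size."""
    vx, vy = zip(*valid)
    points = []
    for n in sizes:
        xs, ys = zip(*train[:n])  # use only the first n training samples
        model = fit_line(xs, ys)
        points.append((n, mse(model, xs, ys), mse(model, vx, vy)))
    return points


rng = random.Random(0)
# Hypothetical data: a quadratic target that a straight line underfits (bias error).
data = [(x, x * x + rng.gauss(0, 0.1))
        for x in (rng.uniform(-1, 1) for _ in range(400))]
train, valid = data[:300], data[300:]

curve = learning_curve_points(train, valid, sizes=[5, 20, 80, 300])
for n, train_err, valid_err in curve:
    print(f"n={n:3d}  train MSE={train_err:.3f}  validation MSE={valid_err:.3f}")
```

Because the straight line cannot represent the quadratic target, both errors should stay well above the noise level even at the largest training size, which is the situation in which more training data does not help much.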

Learning curves in machine learning are useful for many purposes, including comparing different algorithms,[2] choosing model parameters during design,[3] adjusting optimization to improve convergence, and determining the amount of data used for training.[4]

In machine learning, the term has two senses that differ in the x-axis of the curve: the model's experience is graphed either as the number of training examples used for learning or as the number of iterations used in training the model.[5]
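The second sense, with training iterations on the x-axis, can be illustrated with a minimal sketch that records the training loss at each step of batch gradient descent. The function name `training_loss_curve`, the one-parameter model y = w*x, the learning rate and the tiny data set are all assumptions chosen for the example, not taken from the cited sources.

```python
def training_loss_curve(xs, ys, learning_rate=0.1, iterations=50):
    """Fit y = w*x by batch gradient descent, recording the loss before each update."""
    w = 0.0
    losses = []
    n = len(xs)
    for _ in range(iterations):
        # Mean squared error of the current model on the training data.
        losses.append(sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n)
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w, losses


xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 2.0, 4.0, 6.0]  # hypothetical data generated by y = 2x

w, losses = training_loss_curve(xs, ys)
# Plotting `losses` against the iteration index gives the learning curve.
for step in (0, 1, 2, 5, 10):
    print(f"iteration {step:2d}: loss = {losses[step]:.6f}")
```

Here the x-axis is the iteration index and the y-axis is the recorded loss; a curve of this kind is what one would inspect when adjusting the optimization to improve convergence.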

References

  1. ^ scikit-learn developers. "Validation curves: plotting scores to evaluate models — scikit-learn 0.20.2 documentation". Retrieved February 15, 2019.
  2. ^ Madhavan, P.G. (1997). "A New Recurrent Neural Network Learning Algorithm for Time Series Prediction" (PDF). Journal of Intelligent Systems. p. 113 Fig. 3.
  3. ^ "Machine Learning 102: Practical Advice". Tutorial: Machine Learning for Astronomy with Scikit-learn.
  4. ^ Meek, Christopher; Thiesson, Bo; Heckerman, David (Summer 2002). "The Learning-Curve Sampling Method Applied to Model-Based Clustering". Journal of Machine Learning Research. 2 (3): 397.
  5. ^ Sammut, Claude; Webb, Geoffrey I. (Eds.). Encyclopedia of Machine Learning (1st ed.). Springer. p. 578. ISBN 978-0-387-30768-8.