
Meta-learning (computer science)


Meta learning is a subfield of machine learning in which automatic learning algorithms are applied to meta-data about machine learning experiments. Although different researchers hold different views on what the term means exactly (see below), the main goal is to use such meta-data to understand how automatic learning can become flexible in solving different kinds of learning problems, and hence to improve the performance of existing learning algorithms.

Flexibility is important because each learning algorithm is based on a set of assumptions about the data, its inductive bias. This means that it will only learn well if the bias matches the data in the learning problem. By using different kinds of meta-data (properties of the learning problem, algorithm properties or performance measures, hypotheses previously derived from the data, ...), it is possible to select, combine or alter different learning algorithms so that they perform better on a given learning problem.

Different views on meta learning

These are some of the views on (and approaches to) meta learning; note that many variations on these general approaches exist:

  • stacked generalisation --- works by combining different learning algorithms. The meta-data is formed by the predictions of the different algorithms together with the correct answers provided by the training data. Another learning algorithm then learns from this meta-data to predict which combinations of algorithms generally give good results. Given a new learning problem, the selected set of algorithms is used and their predictions are combined (e.g. by majority vote) to provide the final prediction. Since each algorithm is deemed to work well on a subset of problems, the combination is hoped to be more flexible and still (by majority) able to make good predictions. A minimal code sketch is given after this list.
  • discovering meta-knowledge --- works by inducing knowledge (e.g. rules) that expresses how each learning method will perform on different learning problems. The meta-data is formed by characteristics of the data (general, statistical, information-theoretic, ...) in the learning problem, together with characteristics of the learning algorithm (type, settings, performance measures, ...). Another learning algorithm then learns how the data characteristics relate to the algorithm characteristics. Given a new learning problem, its data characteristics are measured and the performance of different learning algorithms can be predicted. Hence, one can select the algorithms best suited for the new problem, at least if the induced relationship holds. A toy sketch of this approach also follows the list.
  • dynamic bias selection --- works by altering the inductive bias of a learning algorithm to match the given problem. This is done by changing key aspects of the learning algorithm, such as the hypothesis representation, heuristic formulae, or parameters. Many different approaches exist; a simple sketch follows the list.
  • inductive transfer --- also called learning to learn, studies how the learning process can be improved over time. The meta-data consists of knowledge about previous learning episodes and is used to efficiently develop an effective hypothesis for a new task. A related goal is to use knowledge acquired in one domain to help learning in other domains; a sketch of this idea closes the list of examples below.
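
A minimal sketch of stacked generalisation, assuming scikit-learn is available; the dataset, base learners and meta-learner chosen here are illustrative, not prescribed by the description above. The meta-data consists of the base learners' out-of-fold predictions paired with the true labels, and a meta-learner learns how to combine those predictions.

# Stacked generalisation sketch (illustrative choices of learners and data).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_learners = [DecisionTreeClassifier(random_state=0),
                 KNeighborsClassifier(n_neighbors=5)]

# Meta-data: out-of-fold predictions of each base learner on the training set,
# paired with the correct answers from the training data.
meta_features = np.column_stack([
    cross_val_predict(clf, X_train, y_train, cv=5, method="predict_proba")
    for clf in base_learners
])

# The meta-learner learns how to combine the base learners' predictions.
meta_learner = LogisticRegression(max_iter=1000).fit(meta_features, y_train)

# For new data, each base learner (refit on all training data) predicts,
# and the meta-learner combines the predictions into the final answer.
for clf in base_learners:
    clf.fit(X_train, y_train)
test_meta = np.column_stack([clf.predict_proba(X_test) for clf in base_learners])
print("stacked accuracy:", meta_learner.score(test_meta, y_test))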
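
A toy sketch of the meta-knowledge approach, under the assumption that a handful of standard datasets stand in for previously seen learning problems; the particular meta-features (size, dimensionality, number of classes) and candidate algorithms are illustrative assumptions. The meta-learner relates data characteristics to the algorithm that performed best.

# Meta-knowledge sketch: learn which algorithm suits which data characteristics.
import numpy as np
from sklearn.datasets import load_iris, load_wine, load_breast_cancer, load_digits
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

candidates = {"naive_bayes": GaussianNB(), "knn": KNeighborsClassifier()}

def meta_features(X, y):
    # Simple data characteristics: number of examples, features and classes.
    return [X.shape[0], X.shape[1], len(np.unique(y))]

meta_X, meta_y = [], []
for loader in (load_iris, load_wine, load_breast_cancer, load_digits):
    X, y = loader(return_X_y=True)
    meta_X.append(meta_features(X, y))
    # Record which candidate algorithm performs best on this dataset.
    scores = {name: cross_val_score(clf, X, y, cv=3).mean()
              for name, clf in candidates.items()}
    meta_y.append(max(scores, key=scores.get))

# The meta-learner relates data characteristics to the best-performing algorithm.
meta_learner = DecisionTreeClassifier().fit(meta_X, meta_y)

# For a new learning problem, measure its characteristics and predict
# which algorithm to use.
X_new, y_new = load_iris(return_X_y=True)
print("suggested algorithm:", meta_learner.predict([meta_features(X_new, y_new)])[0])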
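
A simple sketch of dynamic bias selection, under the illustrative assumption that the bias being adjusted is a single parameter of the learner: the maximum depth of a decision tree, which controls how strongly the algorithm prefers simple hypotheses. The bias is relaxed only while that improves performance on the given problem.

# Dynamic bias selection sketch: adjust a bias parameter to match the problem.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Start with a strong bias (very shallow trees) and relax it step by step,
# keeping the setting that performs best on the given problem.
best_depth, best_score = 1, 0.0
for depth in range(1, 11):
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
    score = clf.fit(X_train, y_train).score(X_val, y_val)
    if score > best_score:
        best_depth, best_score = depth, score

print("selected depth (bias):", best_depth, "validation accuracy:", best_score)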
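
A minimal sketch of inductive transfer, with the simplifying assumption that two halves of one dataset stand in for a previous learning episode and a new task; warm-started stochastic gradient descent carries the previously learned hypothesis over to the new task. Neither the dataset nor the learner is prescribed by the description above.

# Inductive transfer sketch: reuse a hypothesis from a previous learning episode.
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_prev, X_new, y_prev, y_new = train_test_split(X, y, test_size=0.5, random_state=0)
X_new_train, X_new_test, y_new_train, y_new_test = train_test_split(
    X_new, y_new, train_size=100, stratify=y_new, random_state=0)

# Baseline: learn the new task from its small training set alone.
scratch = SGDClassifier(random_state=0).fit(X_new_train, y_new_train)

# Transfer: first learn on the previous episode, then continue training on the
# new task; warm_start reuses the previously learned coefficients.
transfer = SGDClassifier(warm_start=True, random_state=0)
transfer.fit(X_prev, y_prev)             # previous learning episode
transfer.fit(X_new_train, y_new_train)   # refined hypothesis for the new task

print("new task, from scratch :", scratch.score(X_new_test, y_new_test))
print("new task, with transfer:", transfer.score(X_new_test, y_new_test))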