Active learning (machine learning)

This article is about a machine learning method. For active learning in the context of education, see Active learning.

Active learning is a form of supervised machine learning in which the learning algorithm can interactively query the user (or some other information source) to obtain the desired outputs at new data points.

In some situations unlabeled data is abundant but labeling it is expensive. In such a scenario the learning algorithm can actively query the user or teacher for labels. This type of iterative supervised learning is called active learning. Because the learner chooses the examples, the number of examples needed to learn a concept can often be much lower than the number required in normal supervised learning. The approach carries a risk, however: the algorithm might focus on unimportant or even invalid examples.

Active learning can be especially useful in biological research problems such as protein engineering, where a few proteins with a certain interesting function have been discovered and one wishes to determine which of the many possible mutants to make next in order to find others with a similar function^[1].

Definitions

Let $T$ be the total set of all data under consideration. For example, in a protein engineering problem, $T$ would include all proteins that are known to have a certain interesting activity and all additional proteins that one might want to test for that activity.

During each iteration $i$, $T$ is broken up into three subsets:

$T_{K,i}$: Data points where the label is known.
$T_{U,i}$: Data points where the label is unknown.
$T_{C,i}$: A subset of $T_{U,i}$ that is chosen to be labeled.

Most current research in active learning concerns the best method for choosing the data points for $T_{C,i}$.
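The iteration over $T_{K,i}$, $T_{U,i}$ and $T_{C,i}$ can be sketched in a few lines of Python. This is only an illustrative skeleton under stated assumptions: the function names `active_learning_loop`, `choose` and `oracle`, the fixed number of iterations, and the toy data are all hypothetical and do not come from the cited work. The `choose` callback stands in for whichever selection method is used, and `oracle` stands in for the user or teacher who supplies labels.

```python
from typing import Callable, Dict, Hashable, List


def active_learning_loop(
    T: List[Hashable],
    initial_labels: Dict[Hashable, int],
    choose: Callable[[Dict[Hashable, int], List[Hashable]], List[Hashable]],
    oracle: Callable[[Hashable], int],
    iterations: int,
) -> Dict[Hashable, int]:
    """Run a pool-based active learning loop and return the labeled set."""
    T_K = dict(initial_labels)  # T_{K,i}: data points whose label is known
    for _ in range(iterations):
        # T_{U,i}: every point of T that has no label yet
        T_U = [x for x in T if x not in T_K]
        if not T_U:
            break  # nothing left to label
        # T_{C,i}: the points chosen for labeling; must be a subset of T_{U,i}
        T_C = choose(T_K, T_U)
        assert set(T_C) <= set(T_U)
        for x in T_C:
            T_K[x] = oracle(x)  # query the user/teacher for the label
    return T_K


# Toy usage (assumed data): points are integers, the hidden label is x >= 5.
def choose_first_two(T_K, T_U):
    # Placeholder selection rule; a real method would rank T_U by informativeness.
    return T_U[:2]


def toy_oracle(x):
    return int(x >= 5)


labels = active_learning_loop(
    list(range(10)), {0: 0, 9: 1}, choose_first_two, toy_oracle, iterations=3
)
```

The selection rule is deliberately kept as a parameter, since it is the only part of the loop that differs between the methods described in this article.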

Minimum Marginal Hyperplane

Some active learning algorithms are built upon support vector machines (SVMs) and exploit the structure of the SVM to determine which data points to label. Such methods usually calculate the margin, $W$, of each unlabeled datum in $T_{U,i}$ and treat $W$ as an n-dimensional distance from that datum to the separating hyperplane.

Minimum Marginal Hyperplane methods assume that the data with the smallest $W$ are those that the SVM is most uncertain about and therefore should be placed in $T_{C,i}$ to be labeled. Other similar methods, such as Maximum Marginal Hyperplane, choose data with the largest $W$. Tradeoff methods choose a mix of the data with the smallest and the largest $W$.
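As a rough illustration of margin-based selection, the sketch below assumes a linear SVM that has already been trained, so that its hyperplane is given by a weight vector `w` and bias `b`, and takes $W$ to be the geometric distance $|w \cdot x + b| / \lVert w \rVert$. The function names, the `strategy` argument, and the even split used for the tradeoff case are assumptions made for this example, not details from the cited papers.

```python
import math
from typing import List, Sequence


def margin(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Distance W from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def select_by_margin(
    w: Sequence[float],
    b: float,
    T_U: List[Sequence[float]],
    k: int,
    strategy: str = "min",
) -> List[int]:
    """Return indices into T_U of the k points chosen for T_C."""
    k = min(k, len(T_U))
    # Indices of T_U sorted from smallest W to largest W.
    ranked = sorted(range(len(T_U)), key=lambda j: margin(w, b, T_U[j]))
    if strategy == "min":  # Minimum Marginal Hyperplane: most uncertain points
        return ranked[:k]
    if strategy == "max":  # Maximum Marginal Hyperplane: points farthest away
        return ranked[len(ranked) - k:]
    if strategy == "tradeoff":  # assumed split: about half smallest, half largest
        n_small = (k + 1) // 2
        n_large = k - n_small
        return ranked[:n_small] + ranked[len(ranked) - n_large:]
    raise ValueError(f"unknown strategy: {strategy}")


# Toy usage (assumed values): the hyperplane is the line x0 = 0.
w, b = [1.0, 0.0], 0.0
T_U = [(0.1, 5.0), (-3.0, 0.0), (2.0, 1.0), (-0.5, 2.0)]  # W = 0.1, 3.0, 2.0, 0.5
closest = select_by_margin(w, b, T_U, k=2, strategy="min")        # [0, 3]
farthest = select_by_margin(w, b, T_U, k=2, strategy="max")       # [2, 1]
mixed = select_by_margin(w, b, T_U, k=2, strategy="tradeoff")     # [0, 1]
```

With a kernel SVM the same ranking idea applies, but $W$ would be derived from the classifier's decision value rather than from an explicit weight vector.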

Maximum Curiosity

Another active learning method is Maximum Curiosity^[2]. It typically learns a data set with fewer examples than Minimum Marginal Hyperplane, but it is more computationally intensive and only works for discrete classifiers.

Maximum Curiosity takes each unlabeled datum in $T_{U,i}$ and considers every possible label that datum might have. The datum, paired with each assumed label in turn, is added to $T_{K,i}$, and the new $T_{K,i}$ is cross-validated. It is assumed that when the datum is paired with its correct label, the cross-validated accuracy (or correlation coefficient) of $T_{K,i}$ will improve the most. The datum with the most improved accuracy is placed in $T_{C,i}$ to be labeled.
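The procedure can be illustrated with a small sketch. Several choices here are assumptions made only to keep the example self-contained: the classifier is a 1-nearest-neighbour rule, cross-validation is leave-one-out, each datum is scored by the best accuracy gain over its assumed labels, and the names `loo_accuracy` and `max_curiosity` are hypothetical. The cited work may differ on all of these points.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]
Labeled = Tuple[Point, int]


def sq_dist(a: Point, b: Point) -> float:
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def loo_accuracy(data: List[Labeled]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on data."""
    if len(data) < 2:
        return 0.0
    correct = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        nearest = min(rest, key=lambda p: sq_dist(p[0], x))
        correct += int(nearest[1] == y)
    return correct / len(data)


def max_curiosity(
    T_K: List[Labeled], T_U: List[Point], labels: Sequence[int]
) -> Optional[Point]:
    """Return the unlabeled datum whose best assumed label most improves
    the cross-validated accuracy of T_K (None if T_U is empty)."""
    baseline = loo_accuracy(T_K)
    best_x, best_gain = None, float("-inf")
    for x in T_U:
        # Try every possible label for x and keep the largest improvement.
        gain = max(loo_accuracy(T_K + [(x, y)]) - baseline for y in labels)
        if gain > best_gain:
            best_x, best_gain = x, gain
    return best_x


# Toy usage (assumed data): one-dimensional points with two classes.
T_K = [((0.0,), 0), ((1.0,), 0), ((10.0,), 1)]
T_U = [(0.5,), (9.0,)]
chosen = max_curiosity(T_K, T_U, labels=[0, 1])  # (9.0,)
```

In this toy set the lone class-1 point at 10.0 is misclassified during cross-validation, so adding 9.0 with label 1 raises the accuracy from 2/3 to 1, which is why it is selected. The nested loops over data, labels and cross-validation folds also show why the method is more computationally intensive than a margin-based ranking.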

[1] Danziger, S.A., Swamidass, S.J., Zeng, J., Dearth, L.R., Lu, Q., Chen, J.H., Cheng, J., Hoang, V.P., Saigo, H., Luo, R., Baldi, P., Brachmann, R.K. and Lathrop, R.H. (2006) Functional census of mutation sequence spaces: the example of p53 cancer rescue mutants. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 3, 114-125.
[2] Danziger, S.A., Zeng, J., Wang, Y., Brachmann, R.K. and Lathrop, R.H. (2007) Choosing where to look next in a mutation sequence space: Active Learning of informative p53 cancer rescue mutants. Bioinformatics, 23(13), 104-114.
