Multiple-instance learning
Multiple-instance learning (MIL) is a variation on supervised learning. Instead of receiving a set of instances that are individually labeled positive or negative, the learner receives a set of labeled bags, each containing many instances; the labels of the individual instances are not given. Under the most common assumption, a bag is labeled negative if all the instances in it are negative, and positive if at least one instance in it is positive. From a collection of labeled bags, the learner tries to either (i) induce a concept that will label individual instances correctly or (ii) learn how to label bags without inducing the concept.
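The labeling rule under this assumption can be illustrated with a minimal Python sketch. The function names bag_label and classify_bag, and the threshold concept in the usage lines, are hypothetical and chosen only for illustration; they do not come from any particular MIL library.

```python
from typing import Callable, Iterable, TypeVar

Instance = TypeVar("Instance")


def bag_label(instance_labels: Iterable[bool]) -> bool:
    """Label a bag from its (normally hidden) instance labels.

    Under the most common assumption, a bag is positive if at least
    one instance is positive and negative if all instances are negative.
    """
    return any(instance_labels)


def classify_bag(bag: Iterable[Instance],
                 concept: Callable[[Instance], bool]) -> bool:
    """Label a bag using an instance-level concept, as in goal (i)."""
    return any(concept(x) for x in bag)


# Hypothetical concept: a 1-D instance is positive when it exceeds 0.5.
def is_positive(x: float) -> bool:
    return x > 0.5


positive_bag = [0.1, 0.3, 0.9]   # one positive instance
negative_bag = [0.1, 0.2, 0.4]   # no positive instance

print(classify_bag(positive_bag, is_positive))  # True
print(classify_bag(negative_bag, is_positive))  # False
```

In this sketch a single positive instance is enough to make the whole bag positive, which is why a learner that sees only bag labels cannot tell directly which instance in a positive bag is responsible for the label.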
Multiple-instance learning was originally proposed under this name by Dietterich, Lathrop & Lozano-Pérez (1997), but earlier examples of similar research exist, for instance in the work on handwritten digit recognition by Keeler, Rumelhart & Leow (1990). Reviews of the MIL literature include Amores (2013), which provides an extensive review and comparative study of the different paradigms, and Foulds & Frank (2010), which provides a thorough review of the assumptions those paradigms make.
Applications of MIL include:
- Molecule activity
- Image classification (Maron & Ratan 1998)
- Text or document categorization
Classical classification techniques, such as support vector machines and boosting, have been adapted by numerous researchers to the multiple-instance setting.
References
- Dietterich, Thomas G.; Lathrop, Richard H.; Lozano-Pérez, Tomás (1997), "Solving the multiple instance problem with axis-parallel rectangles", Artificial Intelligence, 89 (1–2): 31–71, doi:10.1016/S0004-3702(96)00034-3.
- Amores, Jaume (2013), "Multiple instance classification: Review, taxonomy and comparative study", Artificial Intelligence, 201: 81–105, doi:10.1016/j.artint.2013.06.003.
- Foulds, James; Frank, Eibe (2010), "A Review of Multi-Instance Learning Assumptions", Knowledge Engineering Review, 25 (1): 1–25, doi:10.1017/S026988890999035X.
- Keeler, James D.; Rumelhart, David E.; Leow, Wee-Kheng (1990), "Integrated segmentation and recognition of hand-printed numerals", Proceedings of the 1990 Conference on Advances in Neural Information Processing Systems (NIPS 3), pp. 557–563.
- Maron, Oded; Ratan, Aparna Lakshmi (1998), "Multiple-instance learning for natural scene classification", Proceedings of the Fifteenth International Conference on Machine Learning, pp. 341–349.