Coupled pattern learner

Coupled Pattern Learner (CPL) is a machine learning algorithm that couples the semi-supervised learning of categories and relations in order to forestall the semantic drift associated with bootstrap learning methods.

Semi-supervised learning approaches that use a small number of labeled examples together with many unlabeled examples are often unreliable, because they can produce a set of extractions that is internally consistent but incorrect. CPL addresses this problem by simultaneously learning classifiers for many different categories and relations in the presence of an ontology defining constraints that couple the training of these classifiers. It was introduced by Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr. and Tom M. Mitchell in 2009. ^[1] ^[2]

CPL Summary

CPL is an approach to semi-supervised learning that yields more accurate results by coupling the training of many information extractors. The basic idea behind CPL is that semi-supervised training of a single type of extractor, such as ‘coach’, is much more difficult than simultaneously training many extractors that cover a variety of inter-related entity and relation types. Using prior knowledge about the relationships between these entities and relations, CPL turns unlabeled data into a useful constraint during training. For example, ‘coach(x)’ implies ‘person(x)’ and ‘not sport(x)’.

CPL Description

Coupling of Predicates

CPL relies primarily on coupling the learning of multiple functions so as to constrain the semi-supervised learning problem. It constrains the learned functions in two ways:

Sharing among same-arity predicates according to logical relations
Relation argument type-checking

Sharing among same-arity predicates

Each predicate P in the ontology has a list of other same-arity predicates with which P is mutually exclusive. If predicate A is mutually exclusive with predicate B, A’s positive instances and patterns become negative instances and negative patterns for B. For example, if ‘city’, which has an instance ‘Boston’ and a pattern ‘mayor of arg1’, is mutually exclusive with ‘scientist’, then ‘Boston’ and ‘mayor of arg1’ become a negative instance and a negative pattern, respectively, for ‘scientist’. In addition, some categories are declared to be subsets of other categories. For example, ‘athlete’ is a subset of ‘person’.
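The mutual-exclusion rule above can be illustrated with a minimal Python sketch. The `MUTEX` table and the `propagate_negatives` helper are hypothetical names introduced only for this illustration; they are not taken from the CPL papers, and the toy ontology contains just the ‘city’ and ‘scientist’ categories from the example.

```python
from collections import defaultdict

# Hypothetical toy ontology: each predicate maps to the same-arity
# predicates it is declared mutually exclusive with.
MUTEX = {
    "city": {"scientist"},
    "scientist": {"city"},
}

def propagate_negatives(promoted, mutex):
    """Turn each predicate's positives into negatives for its mutex partners."""
    negatives = defaultdict(set)
    for predicate, items in promoted.items():
        for other in mutex.get(predicate, ()):
            negatives[other] |= items
    return dict(negatives)

promoted_instances = {"city": {"Boston"}, "scientist": set()}
promoted_patterns = {"city": {"mayor of arg1"}, "scientist": set()}

# The same helper works for instances and for patterns.
negative_instances = propagate_negatives(promoted_instances, MUTEX)
negative_patterns = propagate_negatives(promoted_patterns, MUTEX)
```

Here ‘Boston’ ends up as a negative instance and ‘mayor of arg1’ as a negative pattern for ‘scientist’, while ‘city’ receives no negatives because ‘scientist’ has no positives yet.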

Relation argument type-checking

Type-checking information is used to couple the learning of relations and categories. For example, the arguments of the ‘ceoOf’ relation are declared to be of the categories ‘person’ and ‘company’. CPL does not promote a pair of noun phrases as an instance of a relation unless the two noun phrases are classified as belonging to the correct argument types.
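As a rough sketch of this check, assuming the category classifications are available as plain sets of noun phrases, the test for a candidate relation instance might look as follows. `RELATION_TYPES`, `passes_type_check` and the placeholder noun phrases are all hypothetical and chosen only for illustration.

```python
# Hypothetical argument-type signatures: relation -> (type of arg1, type of arg2).
RELATION_TYPES = {"ceoOf": ("person", "company")}

def passes_type_check(relation, arg1, arg2, category_instances,
                      relation_types=RELATION_TYPES):
    """Return True only if both arguments belong to the declared categories."""
    type1, type2 = relation_types[relation]
    return (arg1 in category_instances.get(type1, set())
            and arg2 in category_instances.get(type2, set()))

# Placeholder noun phrases, not real extractions.
category_instances = {
    "person": {"Alice Example"},
    "company": {"Example Corp"},
}

ok = passes_type_check("ceoOf", "Alice Example", "Example Corp", category_instances)
swapped = passes_type_check("ceoOf", "Example Corp", "Alice Example", category_instances)
```

With these inputs `ok` is `True` and `swapped` is `False`, because the swapped pair places a company in the ‘person’ slot.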

Algorithm Description

The CPL algorithm can be summarized as follows.^[2]

Input: An ontology O, and a text corpus C 
Output: Trusted instances/patterns for each predicate
for i=1,2,...,∞ do
 foreach predicate p in O do
  EXTRACT candidate instances/contextual patterns using recently promoted patterns/instances;
  FILTER candidates that violate coupling;
  RANK candidate instances/patterns;
  PROMOTE top candidates;
 end
end

Inputs

The inputs are a large corpus of part-of-speech-tagged sentences and an initial ontology with predefined categories, relations, mutual-exclusion relationships between same-arity predicates, subset relationships between some categories, seed instances for all predicates, and seed patterns for the categories.

Candidate extraction

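CPL finds new candidate instances by using newly promoted patterns to extract the noun phrases that co-occur with those patterns in the text corpus and, symmetrically, finds new candidate patterns by using newly promoted instances. CPL extracts four kinds of candidates: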

Category Instances
Category Patterns
Relation Instances
Relation Patterns
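Extraction of category instances can be sketched in Python as below. This is a toy under stated assumptions, not the method from the papers: CPL works on part-of-speech-tagged text, whereas this sketch approximates a noun phrase with a run of capitalized words, and `NOUN_PHRASE`, `pattern_to_regex` and `extract_candidates` are hypothetical names.

```python
import re
from collections import Counter

# Crude stand-in for a noun phrase: one or more capitalized words.
NOUN_PHRASE = r"((?:[A-Z][\w-]*)(?:\s+[A-Z][\w-]*)*)"

def pattern_to_regex(pattern):
    """Compile a textual pattern such as 'mayor of arg1' into a regex."""
    # Escape the literal context, then swap the arg1 placeholder for a capture group.
    return re.compile(re.escape(pattern).replace("arg1", NOUN_PHRASE))

def extract_candidates(patterns, sentences):
    """Count how often each (noun phrase, pattern) pair co-occurs."""
    counts = Counter()
    for pattern in patterns:
        regex = pattern_to_regex(pattern)
        for sentence in sentences:
            for match in regex.finditer(sentence):
                counts[(match.group(1), pattern)] += 1
    return counts

sentences = [
    "The mayor of Boston spoke on Monday.",
    "She met the mayor of New York yesterday.",
]
counts = extract_candidates(["mayor of arg1"], sentences)
```

The resulting co-occurrence counts are the raw material for the filtering and ranking steps; extracting patterns from promoted instances would run the same idea in the opposite direction.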

Candidate Filtering

Candidate instances and patterns are filtered to maintain high precision and to avoid extremely specific patterns. An instance is only considered for assessment if it co-occurs with at least two promoted patterns in the text corpus, and if its co-occurrence count with all promoted patterns is at least three times greater than its co-occurrence count with negative patterns.
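The instance filter can be written as a small predicate. The sketch below assumes that "at least three times greater" means a ratio of at least 3, which is an interpretation of the wording rather than a confirmed detail; `passes_filter` and the example patterns are hypothetical.

```python
def passes_filter(cooccur, promoted_patterns, negative_patterns):
    """Decide whether a candidate instance survives filtering.

    cooccur maps each pattern to its co-occurrence count with the candidate.
    """
    promoted_hits = {p: c for p, c in cooccur.items()
                     if p in promoted_patterns and c > 0}
    promoted_count = sum(promoted_hits.values())
    negative_count = sum(c for p, c in cooccur.items() if p in negative_patterns)
    # Assumed reading: at least two distinct promoted patterns, and the
    # promoted count is at least 3x the negative count.
    return len(promoted_hits) >= 2 and promoted_count >= 3 * negative_count

promoted = {"mayor of arg1", "arg1 city council"}
negative = {"theory of arg1"}

kept = passes_filter(
    {"mayor of arg1": 4, "arg1 city council": 2, "theory of arg1": 1},
    promoted, negative)
too_few_patterns = passes_filter({"mayor of arg1": 5}, promoted, negative)
too_many_negatives = passes_filter(
    {"mayor of arg1": 1, "arg1 city council": 1, "theory of arg1": 1},
    promoted, negative)
```

Only the first candidate passes: the second co-occurs with a single promoted pattern, and the third has a promoted-to-negative ratio of 2, below the assumed threshold.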

Candidate Ranking

CPL ranks candidate instances using the number of promoted patterns that they co-occur with so that candidates that occur with more patterns are ranked higher. Patterns are ranked using an estimate of the precision of each pattern.
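Both ranking criteria can be sketched as follows. The instance score follows the description directly; the pattern-precision formula (the share of a pattern's co-occurrences that involve already promoted instances) is one plausible estimate and is an assumption here, as are the function names.

```python
def rank_instances(candidate_patterns, promoted_patterns):
    """Order candidates by how many promoted patterns they co-occur with.

    candidate_patterns maps each instance to the set of patterns it occurs with.
    Ties are broken alphabetically to keep the ordering deterministic.
    """
    scores = {inst: len(pats & promoted_patterns)
              for inst, pats in candidate_patterns.items()}
    return sorted(scores, key=lambda inst: (-scores[inst], inst))

def pattern_precision(pattern_counts, promoted_instances):
    """Assumed estimate: fraction of co-occurrences involving promoted instances.

    pattern_counts maps each instance to its co-occurrence count with the pattern.
    """
    total = sum(pattern_counts.values())
    if total == 0:
        return 0.0
    hits = sum(c for inst, c in pattern_counts.items() if inst in promoted_instances)
    return hits / total

ranking = rank_instances(
    {"Boston": {"a", "b", "c"}, "Paris": {"a", "b"}, "Einstein": {"c"}},
    promoted_patterns={"a", "b"},
)
precision = pattern_precision({"Boston": 3, "Einstein": 1}, {"Boston"})
```

In this toy run ‘Boston’ and ‘Paris’ both match two promoted patterns and rank ahead of ‘Einstein’, and the pattern's estimated precision is 0.75.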

Candidate Promotion

CPL ranks the candidates according to their assessment scores and promotes at most 100 instances and 5 patterns for each predicate. Instances and patterns are only promoted if they co-occur with at least two promoted patterns or instances, respectively.
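The promotion step reduces to a thresholded top-k selection. In the sketch below the limits of 100 and 5 and the minimum support of two come from the description above, while `promote` and its arguments are hypothetical names used for illustration.

```python
MAX_INSTANCES = 100  # per predicate, per iteration
MAX_PATTERNS = 5     # per predicate, per iteration

def promote(ranked, support, limit, min_support=2):
    """Promote the best-ranked candidates that have enough support.

    ranked lists candidates best-first; support maps each candidate to the
    number of promoted patterns (or instances) it co-occurs with.
    """
    eligible = [c for c in ranked if support.get(c, 0) >= min_support]
    return eligible[:limit]

ranked_patterns = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]
support = {"p1": 3, "p2": 1, "p3": 2, "p4": 5, "p5": 2, "p6": 2, "p7": 4}

promoted_patterns = promote(ranked_patterns, support, MAX_PATTERNS)
```

Here ‘p2’ is dropped for lack of support and ‘p7’ is cut off by the limit of five, so the call yields `['p1', 'p3', 'p4', 'p5', 'p6']`. Instances would use the same function with `MAX_INSTANCES`.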

Applications

In their paper, ^[1] the authors present results showing the potential of CPL to contribute new facts to Freebase, an existing repository of semantic knowledge. ^[3]

References

[1] Carlson, Andrew; et al. (2009). "Coupling semi-supervised learning of categories and relations". Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing. Colorado, USA: Association for Computational Linguistics: 1–9.
[2] Carlson, Andrew; et al. (2010). "Coupled semi-supervised learning for information extraction". Proceedings of the third ACM international conference on Web search and data mining. NY, USA: ACM: 101–110.
[3] "Freebase data dumps". Metaweb Technologies. 2009.

Liu, Qiuhua; et al. (2008). "Semi-supervised multitask learning". NIPS.

Shinyama, Yusuke; et al. (2006). "Preemptive information extraction using unrestricted relation discovery". HLT-NAACL.

Chang, Ming-Wei; et al. (2007). "Guiding semi-supervision with constraint driven learning". ACL.

Banko, Michele; et al. (2007). "Open information extraction from the web". IJCAI.

Blum, Avrim; et al. (1998). "Combining labeled and unlabeled data with co-training". COLT.
