Constrained conditional model

A Constrained Conditional Model (CCM) is a machine learning and inference framework that augments the learning of conditional (probabilistic or discriminative) models with declarative constraints (written, for example, using a first-order representation) as a way to support decisions in an expressive output space while maintaining modularity and tractability of training and inference.

Models of this kind have recently attracted much attention within the natural language processing (NLP) community. Formulating problems as constrained optimization problems over the output of learned models has several advantages. It allows one to focus on modeling the problem: problem-specific global constraints can be incorporated using a first-order language, which frees the developer from much of the low-level feature engineering, and it guarantees exact inference. It also provides the freedom to decouple the stage of model generation (learning) from the constrained inference stage, which often simplifies both the learning stage and the engineering problem of building an NLP system, while improving the quality of the solutions.


Motivation

Making decisions in many learning domains (such as natural language processing and computer vision) often involves assigning values to sets of interdependent variables, where the expressive dependency structure can influence, or even dictate, which assignments are possible. These settings apply to structured learning problems such as semantic role labeling, as well as to cases that require combining multiple pre-learned components, such as summarization, textual entailment and question answering. In all these cases, it is natural to formulate the decision problem as a constrained optimization problem, with an objective function composed of learned models, subject to domain- or problem-specific constraints.
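
A common way to write this objective (the exact notation varies across the CCM literature) is

\hat{y} = \arg\max_{y \in \mathcal{Y}} \; \sum_{i} w_i \, \phi_i(x, y) \; - \; \sum_{k} \rho_k \, d_{C_k}(x, y)

where the \phi_i(x, y) are feature functions scored by the learned weights w_i, each C_k is a declarative constraint, d_{C_k}(x, y) measures how much the assignment y violates C_k, and \rho_k is the penalty incurred for that violation (an infinite penalty corresponds to a hard constraint).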

Constrained Conditional Models (also known as the Integer Linear Programming formulation of NLP problems) thus support decisions in an expressive output space while keeping training and inference modular and tractable. In most applications of this framework in NLP, Integer Linear Programming (ILP) was used as the inference framework, although other algorithms can be used for that purpose.
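
As an illustration of such an inference step (a minimal sketch, not drawn from the article: the task, the mentions, the scores and the "at most one PER" constraint are all hypothetical), the constrained objective can be handed to an off-the-shelf ILP solver such as PuLP:

from pulp import LpProblem, LpVariable, LpMaximize, lpSum, value

# Hypothetical example: assign one of three labels to each of two mentions,
# maximizing scores from a pre-learned model subject to a declarative constraint.
labels = ["PER", "ORG", "LOC"]
mentions = ["m1", "m2"]

# Scores that would come from a pre-learned conditional model (made up here).
score = {
    ("m1", "PER"): 0.7, ("m1", "ORG"): 0.2, ("m1", "LOC"): 0.1,
    ("m2", "PER"): 0.4, ("m2", "ORG"): 0.5, ("m2", "LOC"): 0.1,
}

prob = LpProblem("ccm_inference", LpMaximize)

# One binary indicator variable per (mention, label) decision.
x = {(m, l): LpVariable(f"x_{m}_{l}", cat="Binary") for m in mentions for l in labels}

# Objective: sum of model scores for the chosen assignments.
prob += lpSum(score[m, l] * x[m, l] for m in mentions for l in labels)

# Structural constraint: each mention gets exactly one label.
for m in mentions:
    prob += lpSum(x[m, l] for l in labels) == 1

# Declarative, problem-specific constraint (hypothetical):
# at most one mention in the sentence may be labeled PER.
prob += lpSum(x[m, "PER"] for m in mentions) <= 1

prob.solve()
assignment = {m: l for m in mentions for l in labels if value(x[m, l]) == 1}
print(assignment)  # e.g. {'m1': 'PER', 'm2': 'ORG'}

The learned model contributes only the scores; the declarative constraints enter as linear (in)equalities over the indicator variables, which is what allows the learning stage and the constrained inference stage to be decoupled.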


Training Paradigms

Learning Local vs. Global Models The objective function used by CCMs can be decomposed and learned in several ways, ranging from complete joint training of the model along with the constraints to complete decoupling between the learning and the inference stage. The advantages of each approach have been studied both theoretically and experimentally by comparing two training paradigms: (1) local models, L+I (learning + inference), and (2) global models, IBT (inference based training). While IBT (joint training) is best in the limit, under some conditions (basically, "good" components) L+I generalizes better.
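
The difference between the two paradigms can be sketched with a toy structured perceptron (illustrative only; the data, the features and the "not both positive" constraint are made up): L+I updates each local model from its own prediction and applies the constraints only at prediction time, while IBT lets the constrained joint prediction drive the updates.

import itertools
import random

# Hypothetical toy task: two interdependent binary decisions per input,
# subject to the declarative constraint that they may not both be 1.
random.seed(0)
data = [([1.0, x], (x > 0, x < 0)) for x in (random.uniform(-1, 1) for _ in range(200))]

def feasible(y):
    return not (y[0] and y[1])  # the declarative constraint

def dot(w, feats):
    return sum(wi * fi for wi, fi in zip(w, feats))

def constrained_argmax(w1, w2, feats):
    # Inference: best *feasible* joint assignment under the two local scores.
    candidates = [y for y in itertools.product([0, 1], repeat=2) if feasible(y)]
    return max(candidates,
               key=lambda y: y[0] * dot(w1, feats) + y[1] * dot(w2, feats))

def local_update(w, feats, gold):
    # Plain perceptron update for one local binary decision.
    pred = dot(w, feats) > 0
    if pred != gold:
        sign = 1.0 if gold else -1.0
        for i in range(len(w)):
            w[i] += sign * feats[i]

w1_li, w2_li = [0.0, 0.0], [0.0, 0.0]      # L+I: trained independently
w1_ibt, w2_ibt = [0.0, 0.0], [0.0, 0.0]    # IBT: trained through joint inference

for feats, (y1, y2) in data:
    # L+I: each local model ignores the constraint during training.
    local_update(w1_li, feats, y1)
    local_update(w2_li, feats, y2)
    # IBT: the constrained joint prediction drives both updates.
    pred = constrained_argmax(w1_ibt, w2_ibt, feats)
    for w, yp, yg in ((w1_ibt, pred[0], y1), (w2_ibt, pred[1], y2)):
        if bool(yp) != bool(yg):
            sign = 1.0 if yg else -1.0
            for i in range(len(w)):
                w[i] += sign * feats[i]

# At prediction time both paradigms use the same constrained inference step.
print(constrained_argmax(w1_li, w2_li, [1.0, 0.5]))
print(constrained_argmax(w1_ibt, w2_ibt, [1.0, 0.5]))

In both cases prediction uses the same constrained inference; the paradigms differ only in whether the constraints are visible during training.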

Constraint Driven Learning Using domain knowledge expressed as constraints to drive learning and help reduce supervision has also been studied. Work on Constraint Driven Learning (CODL) focuses on semi-supervised learning and shows that incorporating domain knowledge significantly improves the performance of the learned model.
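
A rough sketch of such a semi-supervised loop (the function arguments stand in for the learner, the constrained inference procedure and the model-combination step; this is an assumption-laden outline, not the exact CODL algorithm) is:

def constraint_driven_learning(labeled, unlabeled, constraints,
                               train, constrained_inference, combine, rounds=5):
    # train(examples)                               -> model fit on (x, y) pairs
    # constrained_inference(model, x, constraints)  -> best y respecting the constraints
    # combine(old_model, new_model)                 -> weighted mix of the two models
    model = train(labeled)                      # start from the small labeled set
    for _ in range(rounds):
        # Annotate the unlabeled data with constrained inference, so the
        # automatically generated labels respect the declarative knowledge.
        auto_labeled = [(x, constrained_inference(model, x, constraints))
                        for x in unlabeled]
        # Re-train on the noisy annotations and mix with the previous model
        # to stay anchored to the supervised signal.
        model = combine(model, train(auto_labeled))
    return model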


Learning over Latent Representations CCMs have also been applied to latent learning frameworks, where the learning problem is defined over a latent layer. Identifying the correct, or optimal, learning representation is viewed as a structured prediction process and modeled as a CCM. This problem has been studied in both supervised and unsupervised settings, and in all cases explicitly modeling the interdependencies between representation decisions via constraints was shown to improve performance. A training paradigm for latent-representation CCMs driven by the final output performance has also been presented.

Integer Linear Programming for Natural Language Processing (ILP4NLP)

The advantages of the declarative CCM formulation and the availability of off-the-shelf ILP solvers have led to a large variety of natural language processing tasks being formulated within this framework, including semantic role labeling, syntactic parsing, coreference resolution, summarization, transliteration and joint information extraction.