User:Felix QW/Inductive logic programming

Setting

Inductive logic programming has adopted several different learning settings, the most common of which are learning from entailment and learning from interpretations.[1] In both cases, the input is provided in the form of background knowledge B, a logical theory (commonly in the form of clauses used in logic programming), as well as positive and negative examples, denoted $E^{+}$ and $E^{-}$ respectively. The output is given as a hypothesis H, itself a logical theory that typically consists of one or more clauses.

The two settings differ in the format of examples presented.

Learning from entailment

As of 2022, learning from entailment is by far the most popular setting for inductive logic programming.[1] In this setting, the positive and negative examples are given as finite sets $E^{+}$ and $E^{-}$ of positive and negated ground literals, respectively. A correct hypothesis H is a set of clauses satisfying the following requirements, where the turnstile symbol $\models$ stands for logical entailment:[1][2][3]

    Completeness: $B \cup H \models E^{+}$
    Consistency:  $B \cup H \cup E^{-} \not\models \mathit{false}$

Completeness requires any generated hypothesis h to explain all positive examples $E^{+}$, and consistency forbids generation of any hypothesis h that is inconsistent with the negative examples $E^{-}$, both given the background knowledge B.
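
To make these conditions concrete, consider a toy instance (the predicates and facts here are illustrative assumptions, not taken from the cited sources) in which the daughter relation is to be learned from family data:

    \begin{align*}
    B     &= \{\mathit{parent}(\mathit{tom},\mathit{ann}),\ \mathit{parent}(\mathit{tom},\mathit{bob}),\ \mathit{female}(\mathit{ann})\} \\
    E^{+} &= \{\mathit{daughter}(\mathit{ann},\mathit{tom})\} \\
    E^{-} &= \{\lnot\,\mathit{daughter}(\mathit{bob},\mathit{tom})\} \\
    H     &= \{\mathit{daughter}(X,Y) \leftarrow \mathit{female}(X) \land \mathit{parent}(Y,X)\}
    \end{align*}

Here $B \cup H \models E^{+}$, since $\mathit{female}(\mathit{ann})$ and $\mathit{parent}(\mathit{tom},\mathit{ann})$ together derive $\mathit{daughter}(\mathit{ann},\mathit{tom})$, so H is complete; and $B \cup H \cup E^{-} \not\models \mathit{false}$, since nothing derives $\mathit{daughter}(\mathit{bob},\mathit{tom})$, so H is also consistent.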

In Muggleton's setting of concept learning,[4] "completeness" is referred to as "sufficiency", and "consistency" as "strong consistency". Two further conditions are added: "Necessity", which postulates that B does not entail $E^{+}$, does not impose a restriction on h, but forbids any generation of a hypothesis as long as the positive facts are explainable without it. "Weak consistency", which states that no contradiction can be derived from $B \land H$, forbids generation of any hypothesis h that contradicts the background knowledge B. Weak consistency is implied by strong consistency; if no negative examples are given, both requirements coincide. Weak consistency is particularly important in the case of noisy data, where completeness and strong consistency cannot be guaranteed.[4]
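
In the illustrative daughter example above, these extra conditions can be written as follows (again a toy reading, not drawn from the cited sources):

    \begin{align*}
    \text{Necessity:}        &\quad B \not\models E^{+} \\
    \text{Weak consistency:} &\quad B \land H \not\models \mathit{false}
    \end{align*}

Necessity holds there because B alone contains no daughter facts and so cannot entail $\mathit{daughter}(\mathit{ann},\mathit{tom})$, which justifies learning a hypothesis at all; weak consistency holds because $B \land H$ has a model and hence derives no contradiction.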

Learning from interpretations

In learning from interpretations, the positive and negative examples are given as a set of complete or partial Herbrand structures, each of which is itself a finite set of ground literals. Such a structure e is said to be a model of the set of clauses $B \cup H$ if for any substitution $\theta$ and any clause $\mathit{head} \leftarrow \mathit{body}$ in $B \cup H$ such that $\mathit{body}\,\theta \subseteq e$ holds, $\mathit{head}\,\theta \in e$ also holds. The goal is then to output a hypothesis that is complete, meaning every positive example is a model of $B \cup H$, and consistent, meaning that no negative example is a model of $B \cup H$.[1]
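
As a concrete sketch of this model condition (a minimal illustration: the encoding of atoms as tuples, the convention that variables start with an upper-case letter, and the restriction of substitutions to constants occurring in the interpretation are assumptions made here, not taken from the cited sources), the following Python code checks whether a Herbrand interpretation is a model of a set of clauses:

    from itertools import product

    # Illustrative encoding: an atom is a (predicate, args) tuple, a clause is a
    # (head, body) pair, and a Herbrand interpretation e is a set of ground atoms.
    # Variables are assumed to be strings starting with an upper-case letter.

    def is_variable(term):
        return isinstance(term, str) and term[:1].isupper()

    def substitute(atom, theta):
        """Apply the substitution theta (a dict) to the arguments of an atom."""
        pred, args = atom
        return (pred, tuple(theta.get(a, a) for a in args))

    def is_model(clauses, e):
        """Check that for every substitution theta and clause head <- body,
        body*theta being contained in e implies that head*theta is in e."""
        constants = {arg for (_, args) in e for arg in args}
        for head, body in clauses:
            variables = sorted({t for atom in (head, *body)
                                for t in atom[1] if is_variable(t)})
            for values in product(constants, repeat=len(variables)):
                theta = dict(zip(variables, values))
                if (all(substitute(b, theta) in e for b in body)
                        and substitute(head, theta) not in e):
                    return False
        return True

    # Toy hypothesis H: daughter(X, Y) :- female(X), parent(Y, X).
    # (The background knowledge B is taken to be empty here for brevity.)
    H = [(("daughter", ("X", "Y")),
          [("female", ("X",)), ("parent", ("Y", "X"))])]

    # A positive example, given as a complete Herbrand interpretation.
    e_pos = {("female", ("ann",)), ("parent", ("tom", "ann")),
             ("daughter", ("ann", "tom"))}

    print(is_model(H, e_pos))  # True: the positive example is a model of H

An interpretation containing $\mathit{female}(\mathit{ann})$ and $\mathit{parent}(\mathit{tom},\mathit{ann})$ but not $\mathit{daughter}(\mathit{ann},\mathit{tom})$ would fail this check, so for H to remain consistent such an interpretation could only appear among the negative examples.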

Approaches to ILP

  • Sketches of Golem and the Model Inference System
  • Inverse resolution
  • Least general generalisations
  • Sketch of Progol/Aleph
  • Beam search

Meta-interpretive learning

  • Metarules
  • Constraint-based learning

Common issues

  • Language bias
  • Scoring functions
  • Predicate invention
  • Learning recursive programs
  1. ^ a b c d Cropper, Andrew; Dumančić, Sebastijan (2022-06-15). "Inductive Logic Programming At 30: A New Introduction". Journal of Artificial Intelligence Research. 74: 779–782. doi:10.1613/jair.1.13507. ISSN 1076-9757.
  2. ^ Džeroski, Sašo (1996). "Inductive Logic Programming and Knowledge Discovery in Databases" (PDF). In Fayyad, U.M.; Piatetsky-Shapiro, G.; Smith, P.; Uthurusamy, R. (eds.). Advances in Knowledge Discovery and Data Mining. MIT Press. pp. 117–152; see §5.2.4. Archived from the original (PDF) on 2021-09-27. Retrieved 2021-09-27.
  3. ^ De Raedt, Luc (1997). "Logical settings for concept-learning". Artificial Intelligence. 95 (1): 187–201. doi:10.1016/S0004-3702(97)00041-6.
  4. ^ a b Muggleton, Stephen (1999). "Inductive Logic Programming: Issues, Results and the Challenge of Learning Language in Logic". Artificial Intelligence. 114 (1–2): 283–296. doi:10.1016/s0004-3702(99)00067-3; here: Sect. 2.1.