Data-oriented parsing

Data-oriented parsing (DOP, also data-oriented processing) is a probabilistic model in computational linguistics. DOP was conceived by Remko Scha in 1990 with the aim of developing a performance-oriented grammar framework. Unlike other probabilistic models, DOP takes into account all subtrees contained in a treebank rather than being restricted to, for example, 2-level subtrees consisting of a node and its immediate children (as in PCFGs). This allows it to capture more context-sensitive information.^[1]
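The difference between using all subtrees and using only 2-level subtrees can be illustrated with a small sketch. The code below is not taken from the DOP literature: the tuple-based tree encoding, the toy sentence and the function names (fragments_rooted_at, all_fragments, is_two_level) are assumptions made for illustration. It assumes the usual tree-substitution notion of a subtree, in which each node of a fragment either keeps all of its children or is cut off and left as an unexpanded nonterminal.

```python
from itertools import product

# Hypothetical encoding: a tree is a tuple (label, child, ...);
# a bare string is a word. A 1-tuple such as ("NP",) stands for a
# nonterminal that was cut off and left unexpanded in a fragment.
TREE = ("S", ("NP", "John"), ("VP", ("V", "likes"), ("NP", "Mary")))


def fragments_rooted_at(node):
    """Return every fragment whose root is the given tree node."""
    label, *children = node
    options = []
    for child in children:
        if isinstance(child, str):
            # Words are always kept with their parent.
            options.append([child])
        else:
            # Either cut the child off, or expand it with one of its
            # own fragments.
            options.append([(child[0],)] + fragments_rooted_at(child))
    return [(label, *combo) for combo in product(*options)]


def all_fragments(tree):
    """Collect the fragments rooted at every nonterminal node of the tree."""
    frags = fragments_rooted_at(tree)
    for child in tree[1:]:
        if not isinstance(child, str):
            frags.extend(all_fragments(child))
    return frags


def is_two_level(fragment):
    """True if the fragment is a single rule: a node plus its direct children."""
    return all(isinstance(c, str) or len(c) == 1 for c in fragment[1:])


frags = all_fragments(TREE)
rules = [f for f in frags if is_two_level(f)]
print(len(frags))  # 17 fragments in total
print(len(rules))  # 5 of them are 2-level, PCFG-style rules
```

For this single tree, a model restricted to 2-level subtrees sees 5 fragments, while the all-subtrees view sees 17, including larger ones such as an S fragment in which "likes" is already attached to its verb phrase. The number of fragments grows quickly with tree size, so this enumeration is only meant to show the idea on a toy example.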

Several variants of DOP have been developed. The initial version, developed by Rens Bod in 1992, was based on tree-substitution grammar,^[2] while more recently DOP has been combined with lexical-functional grammar (LFG). The resulting DOP-LFG has been applied to machine translation. Other work on learning and parameter estimation for DOP has also found its way into machine translation.

References

[1] R. Bod, R. Scha and K. Sima'an, Data-Oriented Parsing, CSLI Publications, 2003, pp. 1-5.
[2] R. Bod, A computational model of language performance: Data oriented parsing, in: COLING 1992 Volume 3: The 15th International Conference on Computational Linguistics, https://www.aclweb.org/anthology/C92-3126.pdf

External Links

Remko Scha Research on DOP
DOP Homepage
Khalil Sima'an: Learning DOP models from treebanks; Computational Complexity
Andy Way (1999). A hybrid architecture for robust MT using LFG-DOP. Journal of Experimental and Theoretical Artificial Intelligence 11(3):441–471.

