Language and Communication Technologies

From Wikipedia, the free encyclopedia

Language and Communication Technologies (LCT; also known as human language technologies or language technology for short) is the scientific study of technologies that explore language and communication. It is an interdisciplinary field that encompasses the fields of computer science, linguistics and cognitive science.

History

One of the first problems to be studied in the 1950s, shortly after the invention of computers, was an LCT problem, namely the translation of human languages. The large amounts of funding poured into machine translation testify to the perceived importance of the field right from the beginning. It was also in this period that scholars began to develop theories of language and communication based on scientific methods. In the case of language, it was Noam Chomsky who reframed the goal of linguistics as a quest for a formal description of language,[1] whilst Claude Shannon and Warren Weaver provided a mathematical theory that linked communication with information.[2]

Computers and related technologies have provided a physical and conceptual framework within which scientific studies of language and communication can be pursued. Indeed, this framework has been fruitful on a number of levels. For a start, it has given birth to a new discipline, known as natural language processing (NLP), or computational linguistics (CL). This discipline studies, from a computational perspective, all levels of language, from the production of speech to the meanings of texts and dialogues. Over the past 40 years, NLP has produced an impressive computational infrastructure of resources, techniques, and tools for analyzing sound structure (phonology), word structure (morphology), grammatical structure (syntax) and meaning structure (semantics). As well as being important for language-based applications, this computational infrastructure makes it possible to investigate the structure of human language and communication at a deeper scientific level than was ever previously possible.

Moreover, NLP fits in naturally with other branches of computer science, and in particular with artificial intelligence (AI).[3] From an AI perspective, language use is regarded as a manifestation of intelligent behaviour by an active agent. The emphasis in AI-based approaches to language and communication is on the computational infrastructure required to integrate linguistic performance into a general theory of intelligent agents, one that includes, for example, learning generalizations on the basis of particular experience, the ability to plan and reason about intentionally produced utterances, and the design of utterances that will fulfill a particular set of goals. Such work tends to be highly interdisciplinary in nature, as it needs to draw on ideas from fields such as linguistics, cognitive psychology, and sociology. LCT draws on and incorporates knowledge and research from all these fields.

Today

Language and communication are so fundamental to human activity that it is not at all surprising to find that Language and Communication Technologies affect all major areas of society, including health, education, finance, commerce, and travel. Modern LCT is based on a dual tradition of symbols and statistics. This means that nowadays research on language requires access to large databases of information about words and their properties, to large scale computational grammars, to computational tools for working with all levels of language, and to efficient inference systems for performing reasoning. By working computationally it is possible to get to grips with the deeper structure of natural languages, and in particular, to model the crucial interactions between the various levels of language and other cognitive faculties.

Relevant areas of research in LCT include natural language processing, computational linguistics, machine translation, speech recognition, and information retrieval.

Educational programs

Growing interest in the field is evidenced by several European master's programmes in this research area.[4] The University of Groningen, for example, offers a degree programme in Language and Communication Technologies.

Erasmus Mundus Masters:

  • European Masters Program in Language and Communication Technologies https://lct-master.org/
  • International Masters in NLP and HLT
  • European Master in Clinical Linguistics

Language Models

A language model is a model of natural language, typically a probabilistic one that estimates how likely a given sequence of words is.[5][6] Language models are useful for solving various tasks, including speech recognition, machine translation, natural language generation (producing human-like text), optical character recognition, route optimization,[7] handwriting recognition, grammar induction, and information retrieval.

Large language models (LLMs), which are currently the most advanced form, are predominantly based on transformers trained on large datasets (often using texts taken from the publicly available internet). They have supplanted models based on recurrent neural networks, which previously replaced purely statistical models such as word n-gram language models.
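
The word n-gram models mentioned above can be illustrated with a minimal sketch. The corpus and function names below are invented for illustration: a bigram model simply counts, for each word, which words follow it, and turns those counts into conditional probabilities.

```python
from collections import Counter, defaultdict

# Toy corpus; a real n-gram model would be trained on millions of words.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count, for every word, how often each word follows it.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_probs(prev):
    """Maximum-likelihood estimate of P(next word | previous word)."""
    counts = bigram_counts[prev]
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

print(next_word_probs("sat"))  # {'on': 1.0} — "sat" is always followed by "on"
print(next_word_probs("the"))  # 'cat', 'mat', 'dog', 'rug' each with probability 0.25
```

Purely statistical models of this kind assign zero probability to any word pair never seen in training, one of the sparsity problems that motivated the move to neural language models.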

The largest and most capable LLMs are generative pretrained transformers (GPTs), which are widely used in generative chatbots such as ChatGPT, Gemini, and Claude. These models can be fine-tuned for specific tasks or steered with prompt engineering. They acquire predictive power regarding the syntax, semantics, and ontologies[8] inherent in corpora of human language, but they also inherit the inaccuracies and biases present in the data on which they are trained.[9] The llms.txt file offers websites a structured way to make their content more accessible to LLM-based systems such as ChatGPT, Claude, and Google Gemini.[10]

Before fine-tuning, most LLMs are next-token predictors. Fine-tuning can allow LLMs to adopt a conversational format in which they play the role of an assistant.[11][12] Methods such as reinforcement learning from human feedback (RLHF) or constitutional AI can be used to embed human preferences and make LLMs more "helpful, honest, and harmless."
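
Next-token prediction can be sketched with a toy model. The probability table and function names below are invented for illustration (a real LLM computes these distributions with a trained transformer over a large vocabulary), but the generation loop has the same shape: repeatedly ask the model for a distribution over the next token and append a choice until an end marker appears.

```python
# Hypothetical hand-written "model": for each previous token, a probability
# distribution over the next token. <s> and </s> mark start and end of text.
TOY_MODEL = {
    ("<s>",): {"the": 0.9, "a": 0.1},
    ("the",): {"cat": 0.6, "dog": 0.4},
    ("cat",): {"sat": 0.7, "ran": 0.3},
    ("sat",): {"</s>": 1.0},
}

def generate(max_tokens=10):
    """Autoregressive generation with greedy decoding."""
    tokens = ["<s>"]
    for _ in range(max_tokens):
        # Condition only on the most recent token (a real LLM attends
        # to the whole context window).
        probs = TOY_MODEL.get((tokens[-1],), {"</s>": 1.0})
        # Greedy decoding: always pick the most probable next token.
        next_token = max(probs, key=probs.get)
        if next_token == "</s>":
            break
        tokens.append(next_token)
    return tokens[1:]

print(generate())  # ['the', 'cat', 'sat']
```

Sampling from the distribution instead of taking the argmax (optionally sharpened or flattened by a temperature parameter) is what makes chatbot output varied rather than deterministic.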

References

  1. ^ Noam Chomsky, Syntactic Structures, The Hague: Mouton, 1957.
  2. ^ Shannon, C. E. (1948). "A Mathematical Theory of Communication". 9p.io. Archived from the original on 1998-01-31. Retrieved 2011-03-21.
  3. ^ AITopics / NaturalLanguage. Aaai.org. Retrieved on 2011-03-21. Archived 2008-07-31 at the Wayback Machine
  4. ^ Erasmus Mundus – Action 1 – Erasmus Mundus Masters Courses (EMMCs) | EACEA. Eacea.ec.europa.eu. Retrieved on 2011-03-21.
  5. ^ "A New Study Says AI Models Encode Language Like the Human Brain Does". singularityhub.com. Retrieved 2025-07-21.
  6. ^ "What is a large language model (LLM)?". sap.com. Retrieved 2025-07-21.
  7. ^ "Can language models be used for real-world urban-delivery route optimization?". pmc.ncbi.nlm.nih.gov. Retrieved 2025-07-21.
  8. ^ "NeOn-GPT: A Large Language Model-Powered Pipeline for Ontology Learning⋆" (PDF). 2024.eswc-conferences.org. Retrieved 2025-07-21.
  9. ^ "Human Language Understanding & Reasoning". www.amacad.org. Retrieved 2025-07-21.
  10. ^ "What Is LLMS.TXT? Ultimate AIO Guide". coinbound.io. Retrieved 2025-07-21.
  11. ^ "Fine-Tuning LLMs for Multi-Turn Conversations: A Technical Deep Dive". www.together.ai. Retrieved 2025-07-21.
  12. ^ "What is fine-tuning? A guide to fine-tuning LLMs". cohere.com. Retrieved 2025-07-21.