Text processing

In computing, the term text processing refers to the discipline of mechanizing the creation or manipulation of electronic text. Text usually refers to all the alphanumeric characters available on the keyboard of the person performing the mechanization, but more generally it means the abstraction layer immediately above the standard character encoding of the target text. The term processing refers to automated (or mechanized) processing, as opposed to the same manipulation done manually; in either case the intention of the editor is impressed directly upon a set of textual characters.
Text processing involves computer commands which invoke content, content changes, and cursor movement, for example to
- search and replace
- generally format a text file
- generate a report of a text file
- filter a file or report
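As a minimal sketch of three of these operations (search and replace, filtering, and reporting), the following Python fragment works on a small in-memory sample. The function names and the sample text are illustrative assumptions, not part of any particular text processing tool.

```python
import re


def collapse_spaces(text: str) -> str:
    """Search and replace: turn every run of two or more spaces into one."""
    return re.sub(r" {2,}", " ", text)


def keep_matching(text: str, word: str) -> list[str]:
    """Filter: keep only the lines that contain the given word."""
    return [line for line in text.splitlines() if word in line]


def summarize(text: str, word: str) -> dict[str, int]:
    """Report: count the lines and how many of them contain the word."""
    lines = text.splitlines()
    return {"lines": len(lines), "matches": sum(word in line for line in lines)}


# Hypothetical sample text, standing in for the contents of a text file.
sample = "alpha  beta\nERROR disk   full\ngamma\nERROR timeout\n"

cleaned = collapse_spaces(sample)
errors = keep_matching(cleaned, "ERROR")
report = summarize(cleaned, "ERROR")
```

Each function takes text in and hands text (or a summary of it) back, so the steps can be applied in any order or combined.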
Text processing can be viewed as a virtual editing machine with a primitive programming language that has named registers (identifiers) and named positions in the sequence of characters comprising the text. Using these, the "text processor" can, for example, mark a region of text and then move it.
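The idea of named registers and named positions can be illustrated with a toy buffer. This is only a sketch under assumptions: the class `TextBuffer` and its methods `set_mark`, `cut`, and `paste` are hypothetical names chosen for the example and do not describe any specific editor.

```python
class TextBuffer:
    """A toy editing machine with named positions (marks) and named registers."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.marks: dict[str, int] = {}      # named positions in the text
        self.registers: dict[str, str] = {}  # named storage for cut text

    def set_mark(self, name: str, position: int) -> None:
        if not 0 <= position <= len(self.text):
            raise ValueError("mark outside the text")
        self.marks[name] = position

    def cut(self, start_mark: str, end_mark: str, register: str) -> None:
        """Remove the region between two marks and store it in a register."""
        start, end = sorted((self.marks[start_mark], self.marks[end_mark]))
        self.registers[register] = self.text[start:end]
        self.text = self.text[:start] + self.text[end:]
        removed = end - start
        # Keep the remaining marks pointing at the same characters.
        for name, pos in self.marks.items():
            if pos >= end:
                self.marks[name] = pos - removed
            elif pos > start:
                self.marks[name] = start

    def paste(self, register: str, at_mark: str) -> None:
        """Insert the contents of a register at a named position."""
        pos = self.marks[at_mark]
        piece = self.registers[register]
        self.text = self.text[:pos] + piece + self.text[pos:]
        for name, p in self.marks.items():
            if p > pos:
                self.marks[name] = p + len(piece)


# Mark the region "beta " and move it to just after "alpha ".
buf = TextBuffer("beta alpha gamma")
buf.set_mark("start", 0)
buf.set_mark("end", 5)
buf.set_mark("dest", 11)
buf.cut("start", "end", "r")
buf.paste("r", "dest")
```

After the cut, the `dest` mark is shifted left by the length of the removed region, which is what lets the later paste land in the intended place.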
Because the character encoding, font, and color of the text are invisible to text processing, they are transitory properties and do not by themselves provide a formal distinction from word processing. The definite distinctions from word processing are that text processing
- represents "text editing" programs as well as "text processing utilities".
- is (ultimately) "the keyboard way" as opposed to "the mouse way" (e.g. drag and drop, cut and paste) of initiating an edit.
- does not operate at the application layer, for example by rendering a file composed of markup.
In this way font and color are not necessarily a distinguishing factor; text processing is defined more simply as the process of manipulating characters, including the invisible characters that affect font and color.
The development of computer text processing started in earnest with Kleene's formalization of regular languages. Regular expressions could then become mini-programs, complete with a compilation process, able to perform any edit once the language was extended. Even with the extensions, the primary, invariant characteristic of text processing is sequential access to characters, as opposed to random access. An editor invokes an input stream or "string" composed of a line, or of one or more files, and its output can in turn be fed to text processing filters. Each filter is comparable to a step in an algorithm, which could be written as one computer program instead of several regular expressions.
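The chaining of filters over a sequential stream can be sketched with Python generators and the standard `re` module. The filter names `grep`, `substitute`, and `number` and the sample lines are assumptions made for the illustration; each pattern is compiled once and each line is read in order, never by random access.

```python
import re
from typing import Iterable, Iterator


def grep(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    """Pass through only the lines that match the regular expression."""
    compiled = re.compile(pattern)  # compile once, reuse for every line
    for line in lines:
        if compiled.search(line):
            yield line


def substitute(pattern: str, replacement: str, lines: Iterable[str]) -> Iterator[str]:
    """Replace every match of the regular expression in each line."""
    compiled = re.compile(pattern)
    for line in lines:
        yield compiled.sub(replacement, line)


def number(lines: Iterable[str]) -> Iterator[str]:
    """Prefix each line with its position in the stream."""
    for n, line in enumerate(lines, start=1):
        yield f"{n}: {line}"


# A hypothetical input stream; the output of each filter feeds the next one.
stream = iter(["cat sat", "dog ran", "cat ran"])
result = list(number(substitute(r"ran", "slept", grep(r"^cat", stream))))
```

The same three steps could be written as a single loop in one program, which mirrors the remark that a chain of filters is comparable to the steps of one algorithm.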
External links
The subject matter of the book Automatic Text Processing by Gerard Salton