Pipeline (Unix)

In Unix and Unix-like operating systems, a pipeline is a set of filter processes chained by their standard I/O streams, so that the standard output of each process is fed directly to the next one as its standard input. This feature of Unix gave rise to the pipes and filters design pattern of software engineering.
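As a minimal sketch of what this chaining means, the commands below compute the same result twice: once by passing data between two programs through a temporary file, and once with a pipeline. The sample input and the variable names (tmp, via_file, via_pipe) are made up for illustration.

```shell
# Sample input and variable names are illustrative assumptions.
tmp=$(mktemp)

# Without a pipeline: the stages run one after another via a temporary file.
printf 'pear\napple\npear\n' > "$tmp"
via_file=$(sort -u < "$tmp")
rm -f "$tmp"

# With a pipeline: the standard output of printf is connected
# directly to the standard input of sort.
via_pipe=$(printf 'pear\napple\npear\n' | sort -u)

# Both approaches yield the same two lines (apple, pear).
[ "$via_file" = "$via_pipe" ] && echo "same result"
```

In the pipeline form no intermediate file is needed, and both programs run at the same time.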

History

Douglas McIlroy, one of the authors of the early Unix command shells, noticed that much of the time users were feeding the output of one program to another as its input. The Unix pioneers established a means of chaining the running programs together as co-processes so that the output of the first program becomes the input to the second.

Example

Below is an example of a pipeline that implements a kind of spell checker for the web resource indicated by a URL [1].

curl http://www.wikipedia.org/wiki/Pipeline |
sed 's/[^a-zA-Z ]//g' |
tr 'A-Z ' 'a-z\n' |
grep '[a-z]' |
sort -u |
comm -23 - /usr/dict/words

Here is an explanation of the pipeline:

First the curl program obtains the HTML contents of a web page.
The contents of this page are piped through sed, which removes every character that is not a letter or a space.
tr then converts the uppercase letters to their lowercase counterparts and converts the spaces in the lines of text to newlines, so that each 'word' is now on a separate line.
grep keeps only the lines that contain at least one letter, which removes the empty lines.
sort sorts the list of 'words' into alphabetical order, and removes duplicates.
Finally, comm finds which of the words in the list are not in the given dictionary file (in this case, /usr/dict/words).
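
The steps above can be tried without network access or a system dictionary. The following is a sketch only: the sample sentence and the four-word dictionary are made-up stand-ins for the web page and /usr/dict/words, and the variable names dict and misspelled are illustrative.

```shell
# Use byte-order collation so that sort and comm agree on ordering.
export LC_ALL=C

# Hypothetical stand-in for the dictionary file; comm needs it sorted.
dict=$(mktemp)
printf '%s\n' a is pipeline this > "$dict"

# Hypothetical stand-in for the page contents, fed through the same filters.
misspelled=$(
  printf 'This is a pipelin, a Pipeline!\n' |
  sed 's/[^a-zA-Z ]//g' |   # drop everything except letters and spaces
  tr 'A-Z ' 'a-z\n' |       # lowercase; one word per line
  grep '[a-z]' |            # drop empty lines
  sort -u |                 # sort and remove duplicates
  comm -23 - "$dict"        # keep words not found in the dictionary
)

echo "$misspelled"          # prints: pipelin
rm -f "$dict"
```

Only the word missing from the stand-in dictionary is reported.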

See also