GNU parallel

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Wikidrone (talk | contribs) at 21:23, 15 September 2010 (this shit is written in perl!). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

GNU parallel is a command-line utility for Linux and other Unix-like operating systems that allows the user to execute shell scripts in parallel. GNU parallel is free software, written in Perl. It is available[1] under the terms of the GPLv3.

Usage

The most common use is to replace a shell loop. For example, a loop such as

   (for x in `cat list` ; do 
       do_something $x
   done) | process_output

can be rewritten in the form

   cat list | parallel do_something | process_output

where the file list contains the arguments for do_something, one per line, and where the process_output stage may be omitted.
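The rewrite above can be tried with a minimal sketch such as the following. The names are assumptions made for illustration: do_something is a hypothetical script that prints its argument in upper case, sort stands in for process_output, and the file list is created in a temporary directory. The parallel form is attempted only when the parallel found on the PATH identifies itself as GNU parallel.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Hypothetical do_something: prints its single argument in upper case.
# It lives in a script file so that parallel can start it as a separate process.
cat > "$workdir/do_something" <<'EOF'
printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
EOF

# The file "list" holds one argument per line.
printf 'alpha\nbeta\ngamma\n' > "$workdir/list"

# Loop form: jobs run one after another; sort stands in for process_output.
loop_out=$(for x in $(cat "$workdir/list"); do
    sh "$workdir/do_something" "$x"
done | sort)
echo "$loop_out"

# Parallel form: attempted only if the parallel on PATH is GNU parallel.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    parallel_out=$(cat "$workdir/list" | parallel sh "$workdir/do_something" | sort)
    echo "$parallel_out"
fi

rm -rf "$workdir"
```

Because the jobs may finish in any order, the sketch sorts the output in both forms so that the two results can be compared directly.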

Scripts using GNU parallel are often easier to read than scripts using pexec.

GNU parallel also features:

  • grouping of standard output and standard error, so that the output of jobs running in parallel does not run together;
  • retaining the order of the output so that it matches the order of the input;
  • dealing nicely with file names containing special characters such as space, single quote, double quote, ampersand, and UTF-8 encoded characters.
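The order-retaining behaviour in the list above can be illustrated with a short sketch. It assumes the -k (--keep-order) option of GNU parallel, which the text above does not name, and uses made-up jobs that sleep for the number of seconds they receive, so the first input is the last to finish.

```shell
#!/bin/sh
# Output expected when results are printed in input order.
expected=$(printf 'job 2\njob 1\njob 0')

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # -k (--keep-order) prints results in input order, not completion order.
    ordered=$(printf '2\n1\n0\n' | parallel -k 'sleep {}; echo job {}')
    if [ "$ordered" = "$expected" ]; then
        echo "output order matches input order"
    fi
else
    echo "GNU parallel not found; output in input order would be:"
    echo "$expected"
fi
```

Without -k, the lines would normally appear in completion order, with the shortest sleep first.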

By default, GNU parallel runs 9 jobs in parallel. With the option -j+0, parallel instead detects the number of CPUs and runs one job per CPU.
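As a rough sketch of the job-count option, the following compares -j+0 with a fixed limit. The CPU count is read with getconf _NPROCESSORS_ONLN, which is a common extension and not something the text above relies on; the item strings are placeholders.

```shell
#!/bin/sh
# CPUs currently online; _NPROCESSORS_ONLN is a widespread getconf extension,
# so fall back to 1 where it is missing.
cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "CPUs detected: $cpus"

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # -j+0: one job slot per detected CPU.
    per_cpu=$(printf '1\n2\n3\n4\n' | parallel -j+0 echo item | sort)
    # -j2: a fixed limit of two simultaneous jobs.
    fixed=$(printf '1\n2\n3\n4\n' | parallel -j2 echo item | sort)
    echo "$per_cpu"
fi
```

Both invocations run the same four jobs; only the number of jobs allowed to run at the same time differs.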

An introductory video on GNU parallel can be found on YouTube.

Examples

 find . -name "*.foo" | parallel grep bar

The above is equivalent to:

 grep bar `find . -name "*.foo"`

Note that the above command uses backticks (`), not single quotes ('). It searches all files in the current directory and its subdirectories whose names end in .foo for occurrences of the string bar. The parallel command works as expected unless a file name contains a newline. To avoid this limitation, one may use:

 find . -name "*.foo" -print0 | parallel -0 grep bar

The above command uses the GNU-specific -print0 extension to find, which separates file names with the null character instead of a newline.
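The newline limitation can be reproduced with a small sketch. The file names plain.foo and odd(newline)name.foo are invented for the demonstration and are created in a temporary directory; the parallel step is attempted only when GNU parallel is installed.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Two files containing "bar"; the second has a newline inside its name.
printf 'bar\n' > "$workdir/plain.foo"
printf 'bar\n' > "$workdir/$(printf 'odd\nname').foo"

# Newline-separated output splits the odd name across two lines.
by_line=$(find "$workdir" -name '*.foo' | wc -l | tr -d ' ')
# Null-separated output has exactly one terminator per file.
by_null=$(find "$workdir" -name '*.foo' -print0 | tr -cd '\000' | wc -c | tr -d ' ')
echo "lines: $by_line, null-terminated names: $by_null"

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # With -0, each null-terminated name becomes one grep job.
    matched=$(find "$workdir" -name '*.foo' -print0 | parallel -0 grep -c bar | wc -l | tr -d ' ')
    echo "files searched by parallel: $matched"
fi

rm -rf "$workdir"
```

The line count overstates the number of files, while the count of null terminators matches it, which is why the -print0 and -0 pair is the safer choice.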


 find . -name "*.foo" | parallel -X mv {} /tmp/trash

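The above command uses {} to mark the position at which parallel inserts the argument list. With -X, parallel places as many arguments as fit on one command line into each invocation of mv.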
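The effect of {} together with -X can be previewed without moving any files by putting echo in front of the command. This is only a sketch: the file names are invented, and -j1 is added as an assumption to keep all arguments in a single invocation, since with several job slots -X distributes the arguments among the jobs.

```shell
#!/bin/sh
# What a single mv invocation over three (hypothetical) files looks like.
expected='mv a.foo b.foo c.foo /tmp/trash'

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # echo turns the job into a dry run: the command is printed, not executed.
    # -j1 keeps all arguments in one job; with more job slots, -X spreads
    # the arguments across several invocations.
    dry_run=$(printf 'a.foo\nb.foo\nc.foo\n' | parallel -j1 -X echo mv {} /tmp/trash)
    echo "$dry_run"
else
    echo "$expected"
fi
```

Removing echo would run the mv itself, so the dry run is a cheap way to check where the arguments land before touching any files.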

 find . -maxdepth 1 -type f -name "*.ogg" | parallel -X -r cp -v -p {} /home/media

The command above does the same as:

 cp -v -p *.ogg /home/media

However, the former command, which combines find, parallel, and cp, is more resource efficient and will not halt with an error if the expansion of *.ogg is too large for the shell.

References