[linux-elitists] Recommendation: GNU Parallel

Shlomi Fish shlomif at shlomifish.org
Mon Nov 12 03:53:39 PST 2012


Hi all,

I'd like to recommend you take a look at GNU Parallel, which is a command line
tool, similar to xargs, for easy command line parallelisation:

* http://www.gnu.org/software/parallel/

* http://en.wikipedia.org/wiki/GNU_parallel

GNU Parallel works very similarly to xargs, with some additional flags.

A simple example would be:

	$ seq 1 100 | parallel echo "{}"

This just echoes the arguments. You can run up to four jobs at once using -j4:

	$ seq 1 100 | parallel -j4 echo "{}"

If you want to distribute the jobs across the network, you can use --sshlogin:
 
	$ seq 1 100 | parallel --sshlogin 4/sh --sshlogin 2/lap echo "{}"

This runs up to 4 jobs on "sh" and up to 2 jobs on "lap" simultaneously.

One setup that works nicely for me is to number the jobs sequentially and to
derive each job's input range and output file name from {} (you can use
printf(1), expr, and/or $((...)) for that). So, for example, I have written
this script:

seq 0 319 | parallel --sshlogin 4/sh --sshlogin 2/lap \
    "$HOME"/apps/fcs/bin/freecell-solver-fc-pro-range-solve \
        "\$(({}*100+1))" "\$((({}+1)*100))" 1 \
        --read-from-file \
        4,/home/shlomif/progs/freecell/git/fc-solve/fc-solve/source/Presets/testing-presets/mfi-with-2-more-scans.sh \
        --flares-choice fcpro \> \
        "$HOME"/Arcs/FC_SOLVE_SUMMARIZE_RESULTS/mfi-with-two-more-scans-flares-choice-fcpro.fc-pro-dump__\$\(printf \"%06d\" \"{}\"\).txt \; echo "Finished {}"
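
To make the arithmetic in the script above easier to follow, here is a minimal
sketch of what the $((...)) and printf expressions evaluate to for a given job
number. The helper names (range_start, range_end, job_suffix) and the shortened
"dump__" file name are hypothetical and used only for illustration; the real
script passes the expressions to the remote shell through parallel instead.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the original script). They mirror the
# expressions that GNU Parallel substitutes {} into.

# First item handled by job $1, i.e. {}*100+1
range_start() {
    echo "$(( $1 * 100 + 1 ))"
}

# Last item handled by job $1, i.e. ({}+1)*100
range_end() {
    echo "$(( ($1 + 1) * 100 ))"
}

# Zero-padded suffix for the output file name, i.e. printf "%06d" {}
job_suffix() {
    printf '%06d' "$1"
}

# Show the mapping for the first, second, and last job of "seq 0 319".
for job in 0 1 319; do
    echo "job $job: range $(range_start "$job")-$(range_end "$job") -> dump__$(job_suffix "$job").txt"
done
```

So the 320 jobs each cover a block of 100 consecutive numbers, and together
they cover 1 through 32000 without gaps or overlaps.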

The Wikipedia page claims that parallel will aggregate the output of the tasks
based on their order, but I observed otherwise, so I just put every job's
output in its own file. Here I used sshfs, but it can also be done using "scp"
or a pipe to «ssh myhost cat > result».
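
With one file per job, the results can be put back in job order afterwards. The
following is a minimal sketch, assuming zero-padded file names like those
produced by printf "%06d"; the temporary directory, the "dump__" prefix, and the
simulated job output are placeholders for illustration only.

```shell
#!/bin/sh
# Sketch: reassemble per-job output files in job order.
# The directory and the file name prefix are placeholders.
outdir=$(mktemp -d)

# Simulate jobs finishing out of order, each writing its own file.
for job in 2 0 3 1; do
    printf 'result of job %d\n' "$job" \
        > "$outdir/dump__$(printf '%06d' "$job").txt"
done

# The shell expands the glob in sorted order, and zero-padded numbers
# sort in numeric order, so cat emits the results in job order.
combined=$(cat "$outdir"/dump__*.txt)
echo "$combined"

rm -rf "$outdir"
```

GNU Parallel also documents a -k (--keep-order) option that is meant to print
output in the order of the input; the sketch above does not rely on it.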
 
As with most other GNU software, GNU Parallel is FOSS, under the GPLv3, which
should be OK for most command-line needs, as long as one does not accept any
draconian interpretations of the GPL like Nmap's (see
https://svn.nmap.org/nmap/COPYING ).

One fact that surprised me about GNU Parallel was that it is written in Perl.
It's not that I don't like Perl or think it is unsuitable for writing such a
tool, it's just that I had assumed it was written in C.

Anyway, using GNU Parallel can really speed up processing, especially given
today's proliferation of multi-core machines.

Regards,

	Shlomi Fish


-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
Original Riddles - http://www.shlomifish.org/puzzles/

One thing I could never understand is why in Microsoft Word, it often happens
that I press enter… and the font changes.

Please reply to list if it's a mailing list post - http://shlom.in/reply .

