Parallel::ForkManager - A simple parallel processing fork manager
  use Parallel::ForkManager;

  $pm = new Parallel::ForkManager($MAX_PROCESSES);

  foreach $data (@all_data) {
    # Forks and returns the pid for the child:
    my $pid = $pm->start and next;

    ... do some work with $data in the child process ...

    $pm->finish; # Terminates the child process
  }
This module is intended for use in operations that can be done in parallel where the number of processes to be forked off should be limited. Typical use is a downloader which will be retrieving hundreds/thousands of files.
The code for a downloader would look something like this:
  use LWP::Simple;
  use Parallel::ForkManager;

  ...

  @links=(
    ["http://www.foo.bar/rulez.data","rulez_data.txt"],
    ["http://new.host/more_data.doc","more_data.doc"],
    ...
  );

  ...

  # Max 30 processes for parallel download
  my $pm = new Parallel::ForkManager(30);

  foreach my $linkarray (@links) {
    $pm->start and next; # do the fork

    my ($link,$fn) = @$linkarray;
    warn "Cannot get $fn from $link"
      if getstore($link,$fn) != RC_OK;

    $pm->finish; # do the exit in the child process
  }
  $pm->wait_all_children;
First you need to instantiate the ForkManager with the ``new'' constructor. You must specify the maximum number of processes to be created. If you specify 0, then NO fork will be done; this is good for debugging purposes.
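The debugging mode described above can be illustrated with a minimal sketch. It assumes that, with a limit of 0, start returns 0 and finish returns to the caller instead of exiting, as the text describes; the @worker_pids array and the item names are hypothetical and exist only to make the behaviour visible.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# A limit of 0 disables forking: the loop body runs sequentially
# in the current process, which makes it easy to step through.
my $pm = Parallel::ForkManager->new(0);

my @worker_pids;    # hypothetical bookkeeping, for illustration only
for my $item (qw(alpha beta gamma)) {
    $pm->start and next;      # start returns 0 here, so nothing is skipped
    push @worker_pids, $$;    # record the pid of the process doing the work
    $pm->finish;              # no fork was done, so this returns instead of exiting
}
$pm->wait_all_children;

# Count how many iterations ran in this very process.
my $same = grep { $_ == $$ } @worker_pids;
print "$same of ", scalar(@worker_pids), " items ran in pid $$\n";
```

Changing the 0 to a positive limit turns the same loop back into a forking one; @worker_pids then stays empty in the parent, because each child only modifies its own copy.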
Next, use $pm->start to do the fork. $pm->start returns 0 for the child process, and the child pid for the parent process (see also perlfunc(1p)/fork()). The ``and next'' skips the rest of the loop body in the parent process. NOTE: $pm->start dies if the fork fails.
$pm->finish terminates the child process (assuming a fork was done in the ``start'').
NOTE: You cannot use $pm->start if you are already in the child process. If you want to manage another set of subprocesses in the child process, you must instantiate another Parallel::ForkManager object!
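A nested setup along the lines of this note might look like the following sketch. The group names, task numbers, and the use of exit codes to pass a small total back to the parent are assumptions made for illustration; exit codes only carry small integers, so this is not a general way to return data.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my $outer = Parallel::ForkManager->new(2);

# Collect each group's exit code in the top-level parent.
my %exit_of;
$outer->run_on_finish(sub {
    my ($pid, $exit_code, $ident) = @_;
    $exit_of{$ident} = $exit_code;
});

for my $group (qw(a b)) {
    $outer->start($group) and next;

    # We are now in a child of $outer. Calling $outer->start here
    # is not allowed, so a fresh manager handles the grandchildren.
    my $inner = Parallel::ForkManager->new(2);
    my $total = 0;
    $inner->run_on_finish(sub {
        my ($pid, $exit_code) = @_;
        $total += $exit_code;
    });

    for my $task (1 .. 3) {
        $inner->start and next;
        $inner->finish($task);    # grandchild exits with its task number
    }
    $inner->wait_all_children;

    $outer->finish($total);       # child exits with the sum (1 + 2 + 3)
}
$outer->wait_all_children;

print "group $_ exited with $exit_of{$_}\n" for sort keys %exit_of;
```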
An optional $process_identifier can be provided to $pm->start. It is used by the ``run_on_finish'' callback (see CALLBACKS) for identifying the finished process.
You can define callbacks in the code, which are called on events like starting a process or upon finish.
The callbacks can be defined with the following methods:
run_on_finish

The parameters of the $code are the following:

  - pid of the process which has terminated
  - exit code of the program
  - identification of the process (if provided in the "start" method)
  - exit signal (0-127: signal name)
  - core dump (1 if there was a core dump at exit)

run_on_start

The parameters of the $code are the following:

  - pid of the process which has been started
  - identification of the process (if provided in the "start" method)

run_on_wait

The $code is also called from the ``start'' and the ``wait_all_children'' methods. No parameters are passed to the $code on the call.
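As a hedged illustration of the five run_on_finish parameters listed above, the sketch below starts one child that exits normally and one that ends itself with SIGTERM. The job names and the %report hash are hypothetical. It assumes a Unix-like system, and that the exit signal arrives as a number, which is what the implementations I am aware of pass.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(2);

# Store everything the finish callback receives, keyed by identifier.
my %report;
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump) = @_;
    $report{$ident} = {
        pid         => $pid,
        exit_code   => $exit_code,
        exit_signal => $exit_signal,
        core_dump   => $core_dump,
    };
});

for my $job (qw(ok killed)) {
    $pm->start($job) and next;

    if ($job eq 'killed') {
        kill 'TERM', $$;    # the child ends itself with a signal
        sleep 10;           # fallback in case delivery is delayed
    }
    $pm->finish(3);         # normal exit with code 3
}
$pm->wait_all_children;

for my $ident (sort keys %report) {
    my $r = $report{$ident};
    print "$ident: exit code $r->{exit_code}, signal $r->{exit_signal}\n";
}
```

A child that ends through finish reports its exit code with a signal of 0, while one ended by a signal reports a non-zero signal, so a callback can tell the two cases apart.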
This small example can be used to get URLs in parallel.
  use Parallel::ForkManager;
  use LWP::Simple;

  my $pm=new Parallel::ForkManager(10);

  for my $link (@ARGV) {
    $pm->start and next;

    my ($fn)= $link =~ /^.*\/(.*?)$/;
    if (!$fn) {
      warn "Cannot determine filename from $link\n";
    } else {
      $0.=" ".$fn;
      print "Getting $fn from $link\n";
      my $rc=getstore($link,$fn);
      print "$link downloaded. response code: $rc\n";
    };
    $pm->finish;
  };
Example of a program using callbacks to get child exit codes:
  use strict;
  use Parallel::ForkManager;

  my $max_procs = 5;
  my @names = qw( Fred Jim Lily Steve Jessica Bob Dave Christine Rico Sara );

  my $pm = new Parallel::ForkManager($max_procs);

  # Setup a callback for when a child finishes up so we can
  # get its exit code
  $pm->run_on_finish(
    sub { my ($pid, $exit_code, $ident) = @_;
      print "** $ident just got out of the pool ".
        "with PID $pid and exit code: $exit_code\n";
    }
  );

  $pm->run_on_start(
    sub { my ($pid,$ident)=@_;
      print "** $ident started, pid: $pid\n";
    }
  );

  $pm->run_on_wait(
    sub {
      print "** Have to wait for one child ...\n"
    },
    0.5
  );

  foreach my $child ( 0 .. $#names ) {
    my $pid = $pm->start($names[$child]) and next;

    # This code is the child process
    print "This is $names[$child], Child number $child\n";
    sleep ( 2 * $child );
    print "$names[$child], Child $child is about to get out...\n";
    sleep 1;
    $pm->finish($child); # pass an exit code to finish
  }

  print "Waiting for Children...\n";
  $pm->wait_all_children;
  print "Everybody is out of the pool!\n";
Do not mix Parallel::ForkManager with your own calls to fork and wait. Do not use more than one Parallel::ForkManager object in one process!
Copyright (c) 2000 Szabó, Balázs (dLux)
All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
dLux (Szabó, Balázs) <dlux@kapu.hu>
  Noah Robin <sitz@onastick.net> (documentation tweaks)
  Chuck Hirstius <chirstius@megapathdsl.net> (callback exit status, example)
  Grant Hopwood <hopwoodg@valero.com> (win32 port)
  Mark Southern <mark_southern@merck.com> (bugfix)