Tuesday, March 11, 2008

Running external commands in Perl

Perl enables you to call external command-line utilities.
Whenever possible, avoid calling external commands:
* Perl supports a large number of built-in functions
* External commands are generally not portable
* External commands are often slower (process setup/teardown overhead)
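
As an illustration of the first point, here is a minimal sketch that lists the current directory with built-in functions instead of spawning `ls`. The variable names are only illustrative, and the trailing slash on directories is an arbitrary formatting choice.

```perl
use strict;
use warnings;

# Read the directory entries natively; no external process is started.
opendir( my $dh, '.' ) or die "Cannot open current directory: $!\n";
my @entries = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
closedir($dh);

# Mark directories with a trailing slash, using the -d file test.
for my $name (@entries) {
    my $suffix = ( -d $name ) ? '/' : '';
    print "$name$suffix\n";
}
```

The same code runs unchanged on systems that have no `ls` command at all.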

There are several methods to execute external commands:

* The open() function
* The system() function
* Back-quotes
* The fork() & exec() functions

All of these methods behave differently, so you should choose one depending on your particular need. In brief, these are the recommendations:

system() : You want to execute a command and don't want to capture its output
exec() : You don't want to return to the calling Perl script
backticks : You want to capture the output of the command
open() : You want to pipe the command (as input or output) to your script

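When a command is given as a single string that contains shell metacharacters, the native shell is used to execute the command line (/bin/sh -c on Unix). When the command and its arguments are given as a list, system() and exec() run the program directly, without a shell.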

Using open()

Use open() when you want to:

- capture the output of a command (syntax: open(FH, "command |"))

- feed an external command with data generated by the Perl script (syntax: open(FH, "| command"))

Examples :

* Read the output from one or more commands

open( README, "ls -l |" );
$line = <README>;

* Write to the input of one or more commands

open( WRITEME, "| Mail -s 'test' joe\@foo.com" );
print WRITEME "Dear John,\n";

#-- list the processes running on your system
open(PS,"ps -e -o pid,stime,args |") || die "Failed: $!\n";
while ( <PS> )
{
#-- do something here
}

#-- send an email to user@localhost
open(MAIL, "| /bin/mailx -s test user\@localhost ") || die "mailx failed: $!\n";
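print MAIL "This is a test message\n";
#-- closing the pipe flushes the data and waits for mailx to finish
close(MAIL) || warn "mailx exited with status $?\n";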
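
The examples above use bareword filehandles and let a shell parse the command string. A more defensive variant, sketched here under the assumption of a Unix-like system and Perl 5.8 or later, uses lexical filehandles and the list form of a piped open, where '-|' reads from the command and '|-' writes to it. The running Perl interpreter ($^X) stands in for a real external command so that the sketch does not depend on any particular utility.

```perl
use strict;
use warnings;

# Read from a command: '-|' means "pipe from". The list form bypasses the shell.
# $^X is the path of the running perl, used here as a stand-in command.
open( my $in, '-|', $^X, '-e', 'print "line $_\n" for 1..3' )
    or die "Cannot start reader: $!\n";
my @lines = <$in>;
close($in) or die "Reader failed, wait status $?\n";
print "Read ", scalar(@lines), " lines\n";

# Write to a command: '|-' means "pipe to". The child upper-cases its input.
open( my $out, '|-', $^X, '-ne', 'print uc $_' )
    or die "Cannot start writer: $!\n";
print {$out} "dear john,\n";
close($out) or die "Writer failed, wait status $?\n";
```

Checking the result of close() matters for pipes: it waits for the command to finish and returns false when the command exits with a non-zero status.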

Using system()

system() executes the command specified. It doesn't capture the output of the command.

system() accepts either a single scalar or a list. If the argument is a single scalar that contains shell metacharacters, system() uses a shell to execute the command ("/bin/sh -c command" on Unix); a scalar without metacharacters is split into words and executed directly. If the argument is a list with more than one element, system() executes the command directly, treating the first element as the command name and the remaining elements as the arguments to that command.

For that reason, it is highly recommended, for both efficiency and safety (especially if you are running a CGI script), that you pass the command and its arguments to system() as a list.

Examples :

#-- calling 'command' with arguments
system("command arg1 arg2 arg3");

#-- better way of calling the same command
system("command", "arg1", "arg2", "arg3");
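
To see why the list form is safer, consider an argument that contains shell metacharacters. The sketch below uses a hypothetical untrusted value ($file) and the running Perl interpreter ($^X) as a stand-in command that simply echoes its arguments. It is only an illustration of the difference, not a complete input-validation scheme.

```perl
use strict;
use warnings;

# Hypothetical untrusted input containing shell metacharacters.
my $file = 'notes.txt; echo INJECTED';

# Unsafe on Unix: the whole string is parsed by /bin/sh, so the part after
# the semicolon would run as a second command.
# system("ls -l $file");

# Safer: each list element reaches the program as exactly one argument,
# untouched by any shell.
my @cmd = ( $^X, '-e', 'print "got argument: [$_]\n" for @ARGV', $file );
my $status = system(@cmd);
die "Could not run $cmd[0]: $!\n" if $status == -1;
```

With the list form the child receives the semicolon and everything after it as part of a single argument, so nothing extra is executed.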

system() returns the wait status of the command and also stores it in $?. This is the value returned by the 'wait' call, not the exit code itself: to get the real exit status of the command, shift $? right by 8 bits ($? >> 8). The low 7 bits ($? & 127) hold the number of the signal, if any, that terminated the command.

If the value of $? is -1, the command failed to execute; in that case you can check the value of $! for the reason for the failure.

system("command", "arg1");
if ( $? == -1 )
{
print "command failed: $!\n";
}
else
{
printf "command exited with value %d\n", $? >> 8;
}
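
The check above can be extended to cover termination by a signal. The following sketch wraps the decoding in a hypothetical helper, describe_status(), which is not part of Perl; the running interpreter ($^X) again serves as the external command.

```perl
use strict;
use warnings;

# describe_status() is a hypothetical helper; it takes a wait status such as $?.
sub describe_status {
    my ($status) = @_;
    return 'failed to start' if $status == -1;

    # Low 7 bits: signal number. Bit 8 (value 128): core dump flag.
    if ( my $signal = $status & 127 ) {
        my $core = ( $status & 128 ) ? 'with' : 'without';
        return "killed by signal $signal ($core core dump)";
    }

    # High byte: the exit code passed to exit() by the command.
    return 'exited with value ' . ( $status >> 8 );
}

system( $^X, '-e', 'exit 3' );
my $message = describe_status($?);
print "$message\n";    # exited with value 3
```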

# The return value of system() is the same wait status that is stored in $?

$err = system( "ls -l | more" );

# Here the more command can be used because the new shell inherits STDIN, STDOUT, and STDERR.

Using backticks (``)

In this case the command to be executed is surrounded by backticks. The command is executed and the output of the command is returned to the calling script.

In scalar context it returns a single (possibly multiline) string, or undef if the command could not be started. In list context it returns a list of lines, or an empty list if the command could not be started.

The exit status of the executed command is stored in $? (see system() above for details).

Examples :

#-- scalar context
$result = `command arg1 arg2`;

#-- the same command in list context
@result = `command arg1 arg2`;
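
A slightly fuller sketch shows both contexts together with a check of $?. It assumes that the path of the running interpreter ($^X) contains no whitespace, because backticks interpolate the variable and hand the resulting string to the shell.

```perl
use strict;
use warnings;

my $perl = $^X;    # assumption: this path contains no whitespace

# List context: one element per line of output.
my @lines = `$perl -le "print for 1..3"`;
die "Command failed, wait status $?\n" if $? != 0;
chomp @lines;
print "lines: @lines\n";

# Scalar context: the whole output as a single string.
my $text = `$perl -le "print for 1..3"`;
die "Command could not be started: $!\n" unless defined $text;
print "length: ", length($text), "\n";
```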

Using exec()

The exec() function replaces the running Perl program with the specified command and never returns to the calling program. The only exception is failure: exec() returns false if the specified command does not exist AND it is executed directly rather than through the shell (for example, when the arguments are passed as a list).

As with system(), it is recommended to pass the command and its arguments as a list.
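
Because exec() does not return, it is usually paired with fork() when the calling script needs to continue afterwards. The following is a minimal sketch of that pattern, assuming a Unix-like system; the running interpreter ($^X) stands in for the external command, and the exit value 7 is arbitrary.

```perl
use strict;
use warnings;

my $pid = fork();
die "fork failed: $!\n" unless defined $pid;

if ( $pid == 0 ) {
    # Child: replace this process with the external command.
    # The list form runs the program directly, without a shell.
    exec( $^X, '-e', 'exit 7' )
        or die "exec failed: $!\n";
}

# Parent: wait for the child and decode its wait status.
waitpid( $pid, 0 );
my $exit = $? >> 8;
print "child $pid exited with value $exit\n";
```

This is roughly what system() does internally, but doing it by hand lets the parent perform other work between fork() and waitpid().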

PATH Environment Variable

All methods for executing external commands use the $ENV{PATH} environment value to locate "unqualified" commands. Unqualified commands have no explicit full path specification. The $ENV{PATH} environment value is initialized from your user environment when the Perl interpreter starts. On Unix-like systems, the $ENV{PATH} environment value is a colon-separated list of search paths (Windows uses semicolons instead).
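
Since the search path is inherited from the user environment, a script can set it explicitly before running anything. The sketch below does that and then mimics the lookup with a hypothetical helper, find_in_path(). The directories /usr/bin and /bin are an assumption about a typical Unix layout, and File::Spec->path() from the standard library is used to split the value.

```perl
use strict;
use warnings;
use File::Spec;

# Restrict the search path to known directories (assumed Unix layout).
$ENV{PATH} = '/usr/bin:/bin';

# find_in_path() is a hypothetical helper that mimics how an unqualified
# command name is resolved against the directories in $ENV{PATH}.
sub find_in_path {
    my ($command) = @_;
    for my $dir ( File::Spec->path() ) {
        my $candidate = File::Spec->catfile( $dir, $command );
        return $candidate if -f $candidate && -x _;
    }
    return;
}

my $sh = find_in_path('sh');
print defined $sh ? "sh resolves to $sh\n" : "sh not found in PATH\n";
```

Setting $ENV{PATH} this way also affects every command started later by system(), exec(), backticks, or a piped open().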