Next: Perl data structures Up: The Perl modules Previous: The Perl modules

A crash-course in Perl

Even though perl is a comparatively young programming language, it is already quite popular on UNIX systems. At first glance, perl looks like just another shell scripting language, and almost every shell script can be rewritten in perl. It also offers many features that are normally found only in high-level programming languages.

Perl provides a convenient programming interface to many features that are sometimes difficult to use in other languages. For example, to analyze the files exported from dBASE, a large amount of text has to be scanned, and certain fields have to be cut out of each line based on a pattern represented as a regular expression.

Another strength of perl is its ability to cooperate with other programs, such as the small UNIX tools. Both features are demonstrated in a first example. Suppose we want some statistics about the users on a certain system, based on the information given by the UNIX command last. We are interested in how often each user logged in and for how long. The output of the command last has the following format:

login  tty  hostname  date  start  end  time logged in.

A sample line:

melzer tty1 turing.mathematik Sat Jun 15 21:05 - 23:35 (02:29)

The following example does just that. It reads the output of the last command and prints an entry for each user on the system, showing the total login time and the number of logins for each.

 1   #!/usr/local/bin/perl -w
 2
 3   open(DB,"last |") or die "Could not execute last :-(";
 4   while (<DB>) {
 5      if (/^(\S*)\s*.*\((.*):(.*)\)$/) {
 6         $hours{$1} += $2;
 7         $minutes{$1} += $3;
 8         $logins{$1}++;
 9      }
10   }
11
12   foreach $user (sort(keys %logins)) {
13      $hours{$user} += int($minutes{$user} / 60);
14      $minutes{$user} %= 60;
15      $totaltime =
           sprintf("%02d:%02d", $hours{$user}, $minutes{$user});
16      write;
17   }
18 
19   format STDOUT_TOP =
20   User           Total login time     Total logins   
21   -------------- -------------------- --------------------
22   .
23   format STDOUT =
24   @<<<<<<<<<<<<< @<<<<<<<<            @####
25   $user,         $totaltime,          $logins{$user}
26   .

First, in line 3, the filehandle DB is attached to the reading end of a pipe that the last command writes to. Without the pipe symbol |, a file named last would be opened for reading instead. If an error occurs, the program terminates with an error message.
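Reading from a pipe can be sketched as follows. This is a minimal illustration, not part of the original program: it assumes Perl 5.8 or later on a UNIX-like system and, so that it also runs where last is not installed, uses the running perl itself (via $^X) as a stand-in child command. The names $fh, @command and @lines are illustrative.

```perl
use strict;
use warnings;

# Stand-in for "last": a child perl process that prints two lines.
# In the list form of open, the mode '-|' means
# "run this command and read from its output".
my @command = ($^X, '-e', 'print "first\n"; print "second\n";');

open(my $fh, '-|', @command) or die "Could not start child process: $!";
my @lines = <$fh>;    # read everything the child wrote to the pipe
close($fh) or die "Child process failed (exit status $?)";

print "Read ", scalar(@lines), " lines from the pipe\n";
```

The two-argument form used in the listing, open(DB,"last |"), works on the same principle: the trailing | marks the string as a command to run rather than a file name.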

Line 4 is the head of a while loop that ends in line 10 and is executed until its condition is false. The conditional expression ``<DB>'' may look odd at first. It reads one line from the filehandle DB and is false when there is nothing left to read. In perl, many commands can be used without naming a variable and use $_ instead. In this case, the line that was read is assigned to the variable $_.
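The implicit use of $_ can be made visible with a small sketch. It assumes Perl 5.8 or later, where a filehandle can be opened on a string; the in-memory text and the names $text, $in and @seen are illustrative stand-ins for the pipe used in the program.

```perl
use strict;
use warnings;

# Illustrative input; an in-memory filehandle stands in for the pipe.
my $text = "alpha\nbeta\ngamma\n";
open(my $in, '<', \$text) or die "Could not open in-memory file: $!";

my @seen;
while (<$in>) {    # shorthand for: while (defined($_ = <$in>))
    chomp;         # with no argument, chomp works on $_
    push @seen, $_;
}
close($in);

print join(',', @seen), "\n";
```

The loop ends when the read operator returns the undefined value at end of input, which is what makes the condition false.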

The intimidating expression on line 5 is just an if statement: lines 6-8 are executed if the condition within the parentheses is true. The condition is a regular expression, which is true if it matches the string. Again, since no input is named, the contents of the variable $_ are used for the pattern match. This regular expression extracts the login name and the login time from the line provided by the last command and stores the name in $1, the hours in $2, and the minutes in $3. The rest of the information is ignored. To help with reading regular expressions, some of the most important characters in such an expression are compiled in the following list:

\        Quote the next metacharacter
^        Match the beginning of the line
.        Match any character (except newline)
$        Match the end of the line
|        Alternation
()       Grouping
[]       Character class
*        Match 0 or more times
+        Match 1 or more times
?        Match 1 or 0 times
{n}      Match exactly n times
{n,}     Match at least n times
{n,m}    Match at least n but not more than m times
*?       Match 0 or more times, as few as possible
+?       Match 1 or more times, as few as possible
??       Match 0 or 1 time, preferring 0
{n}?     Match exactly n times
{n,}?    Match at least n times, as few as possible
{n,m}?   Match at least n but not more than m times, as few as possible
\w       Match a ``word'' character (alphanumeric plus ``_'')
\W       Match a non-word character
\s       Match a whitespace character
\S       Match a non-whitespace character
\d       Match a digit character
\D       Match a non-digit character
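As a sketch of how these pieces fit together, the pattern from line 5 can be applied to the sample line shown earlier. The variable names and the second, artificial string 'aXbXc' are illustrative; the latter only contrasts a greedy quantifier with its minimal counterpart.

```perl
use strict;
use warnings;

# The sample line from the output of last shown earlier.
my $line = 'melzer tty1 turing.mathematik Sat Jun 15 21:05 - 23:35 (02:29)';

my ($name, $hours, $minutes);
if ($line =~ /^(\S*)\s*.*\((.*):(.*)\)$/) {
    # Each pair of parentheses fills the next numbered variable.
    ($name, $hours, $minutes) = ($1, $2, $3);
    print "user=$name hours=$hours minutes=$minutes\n";
}

# Greedy versus minimal matching.
my ($greedy)  = 'aXbXc' =~ /^(.*)X/;     # .*  takes as much as possible
my ($minimal) = 'aXbXc' =~ /^(.*?)X/;    # .*? takes as little as possible
print "greedy=$greedy minimal=$minimal\n";
```

Lines that do not end in a parenthesized duration, such as entries for users who are still logged in, do not match the pattern and are skipped by the if statement.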

These three pieces of information are processed in the body of the if statement in lines 6-8 using hashes. Hashes are similar to normal arrays, but instead of integers any scalar value may be used as a subscript. In this example, the user's login name is used as the key into three hashes. For example, $logins{'melzer'} returns how often the user melzer was logged in. In perl, assignment operators like += and the increment operator ++ behave like the corresponding operators in C.
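The bookkeeping in lines 6-8 can be sketched in isolation. The records below are hypothetical and replace the values captured by the regular expression; the hash names follow the program.

```perl
use strict;
use warnings;

# Hypothetical (login name, hours, minutes) records.
my @records = (
    ['melzer', 2, 29],
    ['plonka', 0, 45],
    ['melzer', 1, 50],
);

my (%hours, %minutes, %logins);
for my $record (@records) {
    my ($name, $h, $m) = @$record;
    $hours{$name}   += $h;    # a missing entry is created and counts as 0
    $minutes{$name} += $m;
    $logins{$name}++;
}

print "melzer logged in $logins{'melzer'} times\n";
```

No initialization is needed: the first use of a key creates the entry, which is why the program can add to $hours{$1} without declaring the user beforehand.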

The next interesting spot is the foreach statement in line 12. It iterates over a list of values and sets the variable $user to each element of the list in turn. Once more, if the variable $user were omitted, $_ would be used automatically. The list required for the foreach loop is generated by the function keys, which returns a list of all keys used to index a hash, in this case a list of all login names. This list is first sorted by the function sort. The program therefore loops over a sorted list of login names, assigning each name in turn to the variable $user.
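A short sketch of both loop forms, using hypothetical login counts; @order and $total are illustrative names that do not appear in the program.

```perl
use strict;
use warnings;

# Hypothetical login counts.
my %logins = (swg => 35, atl => 1, melzer => 11);

# keys returns the names in no particular order; sort fixes the order.
my @order;
foreach my $user (sort(keys %logins)) {
    push @order, $user;
    print "$user: $logins{$user}\n";
}

# Without a loop variable, each element is placed in $_ instead.
my $total = 0;
foreach (values %logins) {
    $total += $_;
}
print "total logins: $total\n";
```

Called without a comparison routine, sort compares the names as strings, which yields the alphabetical order seen in the report.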

Lines 13-14 carry every full 60 minutes over into the hours, so that fewer than 60 minutes remain. Lines 15 and 16 demonstrate perl's output abilities. First, the total login time is formatted with sprintf as zero-padded hours and minutes; then the write command generates a small report using the definitions at the end of the program.

The STDOUT_TOP definition in lines 19-22 describes the header of the report, which is printed at the top of each page of the output. In this case, lines 20 and 21 are printed without any substitutions. The STDOUT format starting in line 23 describes the look of each line of the output and is used every time the write command is executed. Like the parameters of the printf command, it can be subdivided into two parts: the first line is similar to the format string, and the second line contains the variables to be used. Each field in the format part starts with a @ character, followed by information about the justification. For instance, @<<<< signifies five left-justified text characters, while @#### specifies a numeric field with five digits, displayed right-justified.
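The carry arithmetic and the field layout can be tried out without a full report. The sketch below uses hypothetical totals and, instead of format and write, the built-in function formline, which fills a single picture line and appends the result to the accumulator $^A. This is a substitute chosen to keep the example short, not the mechanism used in the program; $report_line is an illustrative name.

```perl
use strict;
use warnings;

# Hypothetical accumulated totals for one user.
my ($user, $hours, $minutes, $logins) = ('melzer', 10, 154, 11);

# Carry whole hours out of the minute total, as in lines 13-14.
$hours   += int($minutes / 60);    # 154 minutes contain 2 full hours
$minutes %= 60;                    # 34 minutes remain
my $totaltime = sprintf("%02d:%02d", $hours, $minutes);

# The picture is single-quoted so that @ is not taken as an array name.
my $picture = '@<<<<<<<<<<<<< @<<<<<<<< @####' . "\n";

$^A = '';                          # clear the format accumulator
formline($picture, $user, $totaltime, $logins);
my $report_line = $^A;
$^A = '';

print $report_line;
```

The name and the time are padded on the right to the width of their fields, while the login count is padded on the left, which is what lines the columns up under the header.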

This little program might produce output like this:

User           Total login time     Total logins   
-------------- -------------------- --------------------
atl            01:46                    1
melzer         12:34                   11
meneghin       00:01                    1
plonka         07:02                    5
swg            447:51                  35






Ingo Melzer
Mon Aug 5 15:12:01 MET DST 1996