Thursday, September 02, 2010

PERL REGULAR EXPRESSIONS: MATCHING TEXT WITH REGULAR EXPRESSIONS, =~ m

In Perl, the “m” operator attempts a regular expression match. The regular expression, or search pattern, is put between two forward slashes, “/../”, and the string to be searched is bound to the pattern with “=~”. Thus, if you want to find the digits in a string, you can do the following:

my $varString="George Washington was born on February 22, 1732";
print "Search string \"$varString\" for digits\n";
while ( $varString=~ m/(\d+)/g) {
    print "Found digits: $1\n";
}

And the output will be:

Search string "George Washington was born on February 22, 1732" for digits
Found digits: 22
Found digits: 1732

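Here the search pattern is “/(\d+)/”: “\d+” matches one or more digits, and the parentheses capture the matched text into “$1”. The “m” means match, and “=~” tells Perl to apply the match to the string “$varString”. Lastly, “g” means a global search: each pass through the while loop resumes where the previous match ended, moving from left to right, until no more matches are found.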
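The same global match can also be evaluated in list context, where it returns every captured group at once instead of one per loop iteration. A minimal sketch, assuming you want all the digit runs collected in an array (the name @allDigits is illustrative and not part of the original example):

```perl
use strict;
use warnings;

my $varString = "George Washington was born on February 22, 1732";

# In list context, m//g returns all captured groups in one go,
# so no while loop is needed to collect the matches.
my @allDigits = $varString =~ m/(\d+)/g;

print "Found digits: @allDigits\n";   # prints "Found digits: 22 1732"
```

The while loop is still the better fit when each match needs its own processing step; the list form is handy when you only want the values.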

If we change the code a little, dropping both the while loop and the global search flag “g”:

my $varString="George Washington was born on February 22, 1732";
print "Search string \"$varString\" for digits\n";
$varString=~ m/(\d+)/;
print "Found digits: $1\n";

The output will be:

Search string "George Washington was born on February 22, 1732" for digits
Found digits: 22

Perl finds the first run of digits and stops; the result is still stored in $1.
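Because $1 is only updated by a successful match, it can still hold a value from an earlier match if this one fails. A cautious variant, sketched here under the assumption that you want to report a failed match explicitly, tests the return value of the match before reading $1 (the name $firstDigits is illustrative, not from the original example):

```perl
use strict;
use warnings;

my $varString = "George Washington was born on February 22, 1732";
my $firstDigits;

# The match returns true on success, so it can guard the use of $1.
if ($varString =~ m/(\d+)/) {
    $firstDigits = $1;
    print "Found digits: $firstDigits\n";   # prints "Found digits: 22"
} else {
    print "No digits found\n";
}
```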

Now, suppose we forget the “~” and write the code as follows:

my $varString="George Washington was born on February 22, 1732";
print "Search string \"$varString\" for digits\n";
$varString= m/(\d+)/;
print "Found digits: $1\n";

The output is:

Search string "George Washington was born on February 22, 1732" for digits
Use of uninitialized value in pattern match (m//) at ./perl_rex.pl line 8.
Use of uninitialized value in concatenation (.) or string at ./perl_rex.pl line 9.
Found digits:

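The reason is that when the “~” is omitted, the line becomes a plain assignment: Perl matches the regex against $_ and stores the result of the match in $varString. Since $_ was never set, both the match and the later use of $1 trigger warnings.
Let's test it using the following code:

my $varString="George Washington was born on February 22, 1732";
print "Search string \"$varString\" for digits\n";
$_="George Washington was born on February 22, 1732";
$varString= m/(\d+)/;
print "Found digits: $1\n";
print "Found digits: $varString\n";

And the output is:

Search string "George Washington was born on February 22, 1732" for digits
Found digits: 22
Found digits: 1

Because $_ now holds the string, the match succeeds and $1 contains 22. $varString, however, receives the value of the match in scalar context, which is 1 (true) on success, not the captured digits.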
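If the goal is to store the captured digits in a variable rather than read them from $1, one option is to perform the match in list context. A minimal sketch, where $firstDigits and $matched are illustrative names that do not appear in the original example:

```perl
use strict;
use warnings;

my $varString = "George Washington was born on February 22, 1732";

# Parentheses on the left force list context, so the match returns
# the captured text rather than a true/false value.
my ($firstDigits) = $varString =~ m/(\d+)/;

# Without the parentheses the assignment is in scalar context and
# stores only whether the match succeeded.
my $matched = $varString =~ m/(\d+)/;

print "Captured: $firstDigits\n";   # prints "Captured: 22"
print "Matched: $matched\n";        # prints "Matched: 1"
```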

In fact, the “m” is optional when the pattern is delimited by forward slashes, but writing “=~ m” makes the code a bit easier to read, in my opinion.

my $varString="George Washington was born on February 22, 1732";
print "Search string \"$varString\" for digits\n";
$varString=~ /(\d+)/;
print "Found digits: $1\n";
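The “m” stops being optional once a delimiter other than the forward slash is used, which can help when the pattern itself contains slashes. A small sketch under that assumption; the sample string and the names $path and $program are hypothetical and not taken from the examples above:

```perl
use strict;
use warnings;

# Hypothetical sample string, chosen because it contains slashes.
my $path = "/usr/local/bin/perl";
my $program;

# With delimiters other than "/", the leading "m" is required.
# Braces avoid having to escape each "/" inside the pattern.
if ($path =~ m{/bin/(\w+)\z}) {
    $program = $1;
    print "Program name: $program\n";   # prints "Program name: perl"
}
```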
