Reputation: 69
I am pulling a simple string from between two XML-like tags, but the match is being returned as an array instead of a variable. I am using the following code:
$finishState = $inFileLine =~ m(<State>(.*?)<\/State>)g;
And the value of $inFileLine is:
<recordNum>SW001</recordNum><state>Assigned</state><title>Fix Something</title>
When I run this code, a "1" is stored in $finishState. When I change $finishState to @finishState, the value "Assigned" is stored properly.
I'm unsure why and how to fix this. I'm absolutely not able to use an XML parser.
While having the value I need in an array doesn't kill me I would like to find out why this is happening and modify my regexp to correctly populate the variable. I also considered using grep, sed, awk, etc. but a match seems like a concise and clean way to do this.
Upvotes: 1
Views: 88
Reputation: 3535
This is called context. Perl is a context-based language: the result of an operation depends on the context in which it is evaluated.
There are two main types of context in Perl: scalar and list.
Lists are collections of scalars. We use arrays and hashes to name them.
my $finishState = $inFileLine =~ m(<State>(.*?)<\/State>)g;
In this case you are evaluating the expression in scalar context, which gives you a boolean value saying whether it matched or not, i.e. 1 (matched) in your case.
my @finishState = $inFileLine =~ m(<State>(.*?)<\/State>)g;
In this case you are evaluating the expression in list context, so it will give you all the matches in the array.
So, if you know there is only one match and you want to store it in a scalar, use parentheses to evaluate it in list context, i.e.
my ($finishState) = $inFileLine =~ m(<State>(.*?)<\/State>)g;
Now $finishState will contain your match.
If there is more than one match, then $finishState will contain the first match. Check this and this node for more information on contexts.
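To make the difference concrete, here is a minimal sketch that runs the same match in both contexts. It assumes the sample line from the question; because the tags in that line are lowercase, the pattern here uses lowercase tags with the /i flag, and it drops /g since only one match is wanted. The variable $matched is a name introduced just for this illustration.

```perl
use strict;
use warnings;

# Sample line from the question (the tags are lowercase, hence /i below).
my $inFileLine = '<recordNum>SW001</recordNum><state>Assigned</state><title>Fix Something</title>';

# Scalar context: the match returns true or false, not the captured text.
my $matched = $inFileLine =~ m{<state>(.*?)</state>}i;

# List context (note the parentheses on the left): the match returns the captures.
my ($finishState) = $inFileLine =~ m{<state>(.*?)</state>}i;

print "matched: $matched\n";          # matched: 1
print "finishState: $finishState\n";  # finishState: Assigned
```

The only difference between the two assignments is the pair of parentheses around the variable on the left-hand side.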
Upvotes: 3
Reputation: 3029
Usually you would refer to $1 to see the content of the first matching parentheses:
$inFileLine = '<recordNum>SW001</recordNum><state>Assigned</state><title>Fix Something</title>';
$inFileLine =~ m(<State>(.*?)<\/State>)i;
$finishState = $1;
print $finishState;
outputs
Assigned
perlrequick states that
In list context, //g returns a list of matched groupings, or if there are no groupings, a list of matches to the whole regex.
But the usual way would be to check the return value of the regex to find out whether there is any match, and to refer to $1, $2 etc. to see the matches.
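A minimal sketch of that pattern, assuming the sample line from the question: the match is tested in an if, and $1 is read only when the match succeeded, because after a failed match $1 still holds whatever an earlier successful match left there. The fallback message is made up for this illustration.

```perl
use strict;
use warnings;

my $inFileLine = '<recordNum>SW001</recordNum><state>Assigned</state><title>Fix Something</title>';

my $finishState;
if ($inFileLine =~ m{<state>(.*?)</state>}i) {
    # $1 is only trustworthy right after a successful match.
    $finishState = $1;
}

print defined $finishState ? "$finishState\n" : "no state found\n";  # Assigned
```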
Upvotes: 1
Reputation: 118645
$finishState = $inFileLine =~ m(<State>(.*?)<\/State>)g;
evaluates the regular expression in scalar context, and populates $finishState with a true (1) or false ("") value.
@finishState = $inFileLine =~ m(<State>(.*?)<\/State>)g;
or even
($finishState) = $inFileLine =~ m(<State>(.*?)<\/State>)g;
evaluate the regular expression in list context. The distinction between scalar context and list context is important in Perl, and one of the greatest sources of confusion to new Perl programmers. Many functions and operations behave differently in the two different contexts, and often the only way to be sure what an operation is supposed to do in a particular context is to read the docs.
In this case, @finishState will be populated by a list of all strings matching the capture group in the regular expression (i.e., all strings of length 0 or more enclosed by <State> and </State> tags), which in your example is a list of one element with the value Assigned.
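As a rough illustration of what /g does in each context when there is more than one match, the sketch below uses a made-up line with two state elements (it is not from the question). The names $line, @states and @seen are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical line with two state elements, for illustration only.
my $line = '<state>Assigned</state><state>Closed</state>';

# List context with /g: the captures from every match, all at once.
my @states = $line =~ m{<state>(.*?)</state>}gi;

# Scalar context with /g: one match per iteration, each resuming
# where the previous one ended.
my @seen;
while ($line =~ m{<state>(.*?)</state>}gi) {
    push @seen, $1;
}

print "@states\n";  # Assigned Closed
print "@seen\n";    # Assigned Closed
```

With a single state element, as in the question, @states would simply hold one element.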
Upvotes: 3