Matt

Reputation: 69

Match returned as array instead of variable

I am pulling a simple string from between two XML-like tags, but the match is being returned as an array instead of a variable. I am using the following code:

$finishState = $inFileLine =~ m(<State>(.*?)<\/State>)g;

And the value of $inFileLine is:

<recordNum>SW001</recordNum><state>Assigned</state><title>Fix Something</title>

When I run this code, a "1" is stored in $finishState. When I change $finishState to @finishState the value "Assigned" is stored properly.

I'm unsure why and how to fix this. I'm absolutely not able to use an XML parser.

Having the value in an array doesn't kill me, but I would like to find out why this is happening and modify my regexp so that it populates the scalar variable correctly. I also considered using grep, sed, awk, etc., but a match seems like a concise and clean way to do this.

Upvotes: 1

Views: 88

Answers (3)

Arunesh Singh

Reputation: 3535

It is called context. Perl is a context-based language: the result an operator gives depends on the context in which it is evaluated.

There are two main types of context in Perl.

  1. Scalar context.
  2. List context.

Lists are collections of scalars. We use arrays and hashes to name them.

my $finishState = $inFileLine =~ m(<State>(.*?)<\/State>)g;

In this case you are evaluating the expression in scalar context, which gives you a boolean value indicating whether the pattern matched, i.e. 1 (matched) in your case.

my @finishState = $inFileLine =~ m(<State>(.*?)<\/State>)g;

In this case you are evaluating the expression in list context, so it returns all the matches, which are stored in the array.

So, if you know there is only one match and you want to store it in a scalar, put parentheses around the variable so that the match is evaluated in list context.

That is:

my ($finishState) = $inFileLine =~ m(<State>(.*?)<\/State>)g;

Now $finishState will contain your match.

If there is more than one match, then $finishState will contain the first match. Check this and this node for more information on contexts.
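The difference between the two contexts can be seen side by side in a minimal sketch. The sample line below is an assumption for illustration: it uses <State> with a capital S so that the case-sensitive pattern matches. The names $matched and $state are hypothetical, and /g is left off because only one match is wanted here.

```perl
use strict;
use warnings;

# Sample input assumed for illustration; the tag case matches the pattern.
my $line = '<recordNum>SW001</recordNum><State>Assigned</State><title>Fix Something</title>';

# Scalar context: the match only reports success (1) or failure (empty string).
my $matched = $line =~ m(<State>(.*?)<\/State>);

# List context (parentheses around the variable): the match returns its captures.
my ($state) = $line =~ m(<State>(.*?)<\/State>);

print "scalar context: $matched\n";   # prints "scalar context: 1"
print "list context:   $state\n";     # prints "list context:   Assigned"
```

The only change between the two assignments is the pair of parentheses on the left-hand side, which is what switches the right-hand side from scalar to list context.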

Upvotes: 3

syck

Reputation: 3029

Usually you would refer to $1 to see the content of the first capture group (the first pair of parentheses in the pattern):

$inFileLine = '<recordNum>SW001</recordNum><state>Assigned</state><title>Fix Something</title>';
$inFileLine =~ m(<State>(.*?)<\/State>)i;
$finishState = $1;
print $finishState;

outputs

Assigned

perlrequick states that

In list context, //g returns a list of matched groupings, or if there are no groupings, a list of matches to the whole regex.

But the usual way would be to check the return value of the regex to find out whether there is any match, and to refer to $1, $2 etc. to see the matches.
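A hedged sketch of that pattern: test the match in a condition and read $1 only inside the success branch. The variable names follow the question, and the wording of the fallback message is an assumption.

```perl
use strict;
use warnings;

my $inFileLine = '<recordNum>SW001</recordNum><state>Assigned</state><title>Fix Something</title>';
my $finishState;

# Only trust $1 when the match succeeded: after a failed match, $1 still
# holds whatever an earlier successful match left in it.
# The /i modifier lets <State> in the pattern match <state> in the input.
if ($inFileLine =~ m(<State>(.*?)<\/State>)i) {
    $finishState = $1;
    print "$finishState\n";            # prints "Assigned"
} else {
    print "no State element found\n";
}
```

Guarding the assignment this way means $finishState stays undefined when the tag is missing, instead of silently picking up a stale capture.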

Upvotes: 1

mob

Reputation: 118645

$finishState = $inFileLine =~ m(<State>(.*?)<\/State>)g;

evaluates the regular expression in scalar context, and populates $finishState with a true (1) or false ("") value.

@finishState = $inFileLine =~ m(<State>(.*?)<\/State>)g;

or even

($finishState) = $inFileLine =~ m(<State>(.*?)<\/State>)g;

evaluate the regular expression in list context. The distinction between scalar context and list context is important in Perl, and one of the greatest sources of confusion to new Perl programmers. Many functions and operations behave differently in the two different contexts, and often the only way to be sure what an operation is supposed to do in a particular context is to read the docs.

In this case, @finishState will be populated with a list of all strings matching the capture group in the regular expression (i.e., all strings of length 0 or more enclosed by <State> and </State> tags), which in your example is a list of one element with the value Assigned.
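To make the "list of all matches" behaviour concrete, here is a small sketch with an input that contains two State elements. The input string and the names @states and $first are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical input with two State elements.
my $line = '<State>Assigned</State><State>Closed</State>';

# List context with /g: one element per match of the capture group.
my @states = $line =~ m(<State>(.*?)<\/State>)g;

# A one-element list on the left keeps the first match and discards the rest.
my ($first) = $line =~ m(<State>(.*?)<\/State>)g;

print scalar(@states), " matches: @states\n";   # prints "2 matches: Assigned Closed"
print "first: $first\n";                        # prints "first: Assigned"
```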

Upvotes: 3
