user2837756

Reputation: 39

Perl - Getting value from comma separated line

I've got a Perl file parser I am trying to rewrite. This is a dynamic parser, and I need to extract a value from a comma-separated line.

The line I want to get one value from looks something like this:

ENTRYNAME-8,44544,99955,52,156,15:16:16,15:19:16

(This is the only line in each parsed file that starts with ENTRYNAME-. Everything after the - changes for each file being parsed.)

I want the value after the second comma. (99955 in the example above)

I have tried the following without any luck:

if (/ ENTRYNAME-\((.*)\,(.*)\,(.*)\)/ ) 
{
    $entry_nr = $3;
    print "entry number = $entry_nr";
    next;
}

Upvotes: 1

Views: 964

Answers (3)

Miller

Reputation: 35208

Whenever possible, separate parsing from processing and validating your data.

In this case, if you have comma separated values, go ahead and separate those values. Then worry about filtering your data. Whether you use Text::CSV for parsing is a separate issue, although probably a good idea.

use strict;
use warnings;

while (<DATA>) {
    chomp;
    my @cols = split ',';

    if ($cols[0] =~ /^ENTRYNAME/) {
        print $cols[2], "\n";
    }
}

__DATA__
ENTRYNAME-8,44544,99955,52,156,15:16:16,15:19:16

Outputs:

99955

Upvotes: 1

Brett Schneider

Reputation: 4103

Split it into an array and address the field directly:

my @a = split /,/, $_;
print $a[2];

What happens here is that whatever is in $_ (typically from a for (@allmylines) { loop) is split at each occurrence of a comma. The commas are removed and the pieces are placed in an array (@a). You can then address the fields in the array, starting with 0 for the first field. So to retrieve the third field, use $a[2].
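If the intermediate array is not needed, the same idea can be written with a list slice. This is only a minimal sketch: nth_field is a hypothetical helper name (not from the question), and it assumes the fields never contain quoted or embedded commas.

```perl
use strict;
use warnings;

# Hypothetical helper: return the field at a 0-based index of a
# comma-separated line. Assumes no quoted or embedded commas.
sub nth_field {
    my ($line, $index) = @_;
    chomp $line;    # drop a trailing newline from the local copy
    # The -1 limit keeps trailing empty fields, which split drops by default.
    return (split /,/, $line, -1)[$index];
}

my $line = "ENTRYNAME-8,44544,99955,52,156,15:16:16,15:19:16\n";
print nth_field($line, 2), "\n";    # third field, index 2
```

For the sample line this prints 99955, the same value as $a[2] above.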

Upvotes: 1

TLP

Reputation: 67900

The problem is that your first capture group .* is greedy, so it initially consumes the rest of your string. The regex then backtracks only as far as needed to find two commas, and as a result the captures match at the end of the line.

Also:

  • You are matching literal parentheses \( for some strange reason. Since you do not have any such, those will never match.
  • You do not need to escape commas \,
  • You cannot have a random space in your regex / ENTRY... unless you have one in your target string
  • You do not need to capture strings that you are not going to use

A simple fix is to use a stricter capture group (including the points above):

if (/ENTRYNAME-\d+,\d+,(\d+)/ ) 

This will capture into $1.
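To see the greedy behaviour on the sample line, here is a small sketch. The first pattern is the original one with the literal parentheses and the leading space removed. The second uses [^,]* instead of \d+, which is my own variation for the case where the text after ENTRYNAME- is not purely numeric, as the question hints it may not be.

```perl
use strict;
use warnings;

my $line = 'ENTRYNAME-8,44544,99955,52,156,15:16:16,15:19:16';

# Greedy version: the first .* grabs as much as it can, so the three
# captures are pushed toward the end of the line.
my @greedy = $line =~ /ENTRYNAME-(.*),(.*),(.*)/;
print "greedy third capture = $greedy[2]\n";    # 15:19:16

# [^,]* cannot cross a comma, so each piece stays inside one field.
my ($entry_nr) = $line =~ /^ENTRYNAME-[^,]*,[^,]*,([^,]*)/;
print "entry number = $entry_nr\n";             # 99955
```

The greedy pattern leaves 8,44544,99955,52,156 in the first capture, which is why $3 ends up holding the last field instead of the third.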

As mpapec points out in the comment, you may wish to use Text::CSV to parse CSV data. It will be a lot safer. If your data is simple enough, this solution will do.

Upvotes: 1
