Dave

Reputation: 1

Perl script that parses CSV file excluding the contents enclosed in []

Hi there, I am struggling with a Perl script that parses an eight-column CSV line into another CSV line using split. I want to exclude all the text enclosed in square brackets []. The line looks like this:

128.39.120.51,0,49788,6,SYN,[8192:127:1:52:M1460,N,W2,N,N,S:.:Windows:XP/2000 (RFC1323+, w+, tstamp-):link:ethernet/modem],1,1399385680

I used the following script, but printing $fields[7] gives me N, one of the values inside the [] above. I want $fields[7] to be 1399385680, the last field in the line. The script I tried was:

while (my $line = <LOG>) {
    chomp $line;
    my @fields=grep { !/^[\[.*\]]$/ } split ",", $line;
    my $timestamp=$fields[7];
    print "$fields[7]";
}

Thanks for your time. I will appreciate your help.

Upvotes: 0

Views: 109

Answers (2)

Miller

Reputation: 35198

Always include use strict; and use warnings; at the top of EVERY perl script.

Your "csv" file isn't proper CSV: the bracketed field contains commas but isn't quoted, so a plain split cuts it into pieces. The only thing I can suggest is to remove the contents in the brackets before you split:

use strict;
use warnings;

while (<DATA>) {
    chomp;
    s/\[.*?\]//g;
    my @fields = split ',', $_;
    my $timestamp = $fields[7];
    print "$timestamp\n";
}
__DATA__
128.39.120.51,0,49788,6,SYN,[8192:127:1:52:M1460,N,W2,N,N,S:.:Windows:XP/2000 (RFC1323+, w+, tstamp-):link:ethernet/modem],1,1399385680

Outputs:

1399385680

Obviously it is possible to also capture the contents of the bracketed fields, but you didn't say that was a requirement or goal.

Update

If you want to capture the bracket-delimited field, one method would be to use a regex for capturing instead.

Note that this regex requires every field to have a value. An empty field is skipped, which shifts the indexes of the fields after it.

chomp;
my @fields = $_ =~ /(\[.*?\]|[^,]+)(?:,|$)/g;
my $timestamp = $fields[7];
print "$timestamp";
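If empty fields can occur, a different technique may be easier than extending the capturing regex: split only on commas that sit outside the brackets. The following is a minimal sketch, not part of the answer above. The helper name split_outside_brackets is hypothetical, and it assumes the brackets are balanced and never nested.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): split a line on commas
# that are outside square brackets.
# Assumption: brackets are balanced and never nested.
sub split_outside_brackets {
    my ($line) = @_;
    # A comma counts as a separator unless a ']' follows it with no '['
    # in between, which would mean the comma is inside a bracketed field.
    # The -1 limit keeps trailing empty fields instead of dropping them.
    return split /,(?![^\[]*\])/, $line, -1;
}

my $line = '128.39.120.51,0,49788,6,SYN,[8192:127:1:52:M1460,N,W2,N,N,S:.:Windows:XP/2000 (RFC1323+, w+, tstamp-):link:ethernet/modem],1,1399385680';

my @fields = split_outside_brackets($line);
print "$fields[5]\n";    # the bracketed field, kept in one piece
print "$fields[7]\n";    # the timestamp
```

Because the split keeps empty fields in place, the indexes stay stable even when a column has no value.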

Upvotes: 1

codehead

Reputation: 2115

Well, if you want to actually ignore the text between square brackets, you might as well get rid of it:

while ( my $line = <LOG> ) {
  chomp $line;
  $line =~ s/\[.*?\]//g; # Delete all text between square brackets (/g covers every bracketed group)
  my @fields = split ",", $line;
  my $timestamp = $fields[7];
  print $timestamp, "\n";
}

Upvotes: 0
