spaceboy2020

Reputation: 31

Why is this code skipping every other input line?

I am trying to loop over the data with a regex, splitting it into records that each start with a date (e.g. 9/7/2019). However, the output omits every other line of the input data.

I tried this on Windows and on Mac (Terminal shell) and got the same behaviour on both.

my $file;

{
    local $/ = undef;
    $file = <DATA>;
}

while ($file =~ m/(\d\/\d\/\d{4}.*?)\d\/\d\/\d{4}/gs) {
    print "*$1*\n";
}

__DATA__
9/7/2019 20:35:17,dog
9/7/2019 21:06:16,cat
9/7/2019 22:32:15,parrot
9/7/2019 22:32:15,snail
9/7/2019

I expect the following:

*9/7/2019 20:35:17,dog*
*9/7/2019 21:06:16,cat*
*9/7/2019 22:32:15,parrot*
*9/7/2019 22:32:15,snail*

but instead get the following:

*9/7/2019 20:35:17,dog
*
*9/7/2019 22:32:15,parrot
*

Upvotes: 2

Views: 108

Answers (2)

ikegami

Reputation: 386696

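Your pattern consumes two dates per match: the one that starts the current record and the one that starts the next record. The following match resumes after that second date, so the record it belongs to can no longer be matched and is effectively skipped.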
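To make this visible, here is a minimal sketch that prints where the search resumes after each match. It uses the sample data from the question; `resume_points` is a hypothetical helper name introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: for each match of $re in $text, return the rest of
# the line on which the next search will resume.
sub resume_points {
    my ($text, $re) = @_;
    my @rest;
    while ($text =~ /$re/g) {
        my $p = pos($text);                            # offset just past this match
        my ($tail) = substr($text, $p) =~ /^([^\n]*)/; # remainder of that line
        push @rest, $tail;
    }
    return @rest;
}

my $text = "9/7/2019 20:35:17,dog\n"
         . "9/7/2019 21:06:16,cat\n"
         . "9/7/2019 22:32:15,parrot\n"
         . "9/7/2019 22:32:15,snail\n"
         . "9/7/2019\n";

# The pattern from the question: the trailing date is part of the match.
my @rest = resume_points($text, qr/(\d\/\d\/\d{4}.*?)\d\/\d\/\d{4}/s);
print "search resumes at: '$_'\n" for @rest;
```

The two resume points are ` 21:06:16,cat` and ` 22:32:15,snail`: the dates on those lines have already been consumed, so neither line can start a new match.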

There's no point in checking if the next line starts with a date, so you could use

while (<DATA>) {
   next if !m{^(\d+/\d+/\d+)};

   print "*$1*\n";
}

If the data is already in a string rather than being read from a file handle:

while ($file =~ m{^(\d+/\d+/\d+)}mg) {
   print "*$1*\n";
}

If every line starts with a date, you could even use

while (<DATA>) {
   my @fields = split;
   print "*$fields[0]*\n";
}

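If the data is already in a string rather than being read from a file handle:

while ($file =~ /^(.*)/mg) {
   my @fields = split ' ', $1;   # split the captured line; $_ is not set here
   print "*$fields[0]*\n";
}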

The lack of /s means that . won't match a line feed, which means it won't match beyond the end of the line.
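A short sketch of the difference, using two lines of the sample data (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $text = "9/7/2019 20:35:17,dog\n9/7/2019 21:06:16,cat\n";

# Without /s, "." stops at the line feed, so the capture is one line.
my ($one_line) = $text =~ /^(.*)/;

# With /s, "." also matches the line feed, so the capture runs to the end.
my ($everything) = $text =~ /^(.*)/s;

print "without /s: ", length($one_line), " characters\n";
print "with /s:    ", length($everything), " characters\n";
```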

Upvotes: 2

King11

Reputation: 1229

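The date at the end of your pattern is consumed by each match, so the following match starts after it. Put it in a lookahead so that it is checked but not consumed. Change your while loop to: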

while ($file =~ m/(\d\/\d\/\d{4}.*?)(?=\R\d\/\d\/\d{4})/gs) {
    print "*$1*\n";
}

That should work for you. Test it out at: https://rextester.com/l/perl_online_compiler
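For a side-by-side comparison, here is a rough sketch that runs both patterns over the sample data from the question in list context and counts the captures. It assumes Perl 5.10 or later, since `\R` is not available before that.

```perl
use strict;
use warnings;

my $text = join "\n",
    '9/7/2019 20:35:17,dog',
    '9/7/2019 21:06:16,cat',
    '9/7/2019 22:32:15,parrot',
    '9/7/2019 22:32:15,snail',
    '9/7/2019', '';

# In list context, a /g match returns every capture at once.
# Original pattern: the trailing date is consumed by each match.
my @consumed  = $text =~ m/(\d\/\d\/\d{4}.*?)\d\/\d\/\d{4}/gs;

# Lookahead pattern: the trailing date is checked but left in place.
my @lookahead = $text =~ m/(\d\/\d\/\d{4}.*?)(?=\R\d\/\d\/\d{4})/gs;

print scalar(@consumed),  " records with the consuming pattern\n";
print scalar(@lookahead), " records with the lookahead pattern\n";
print "*$_*\n" for @lookahead;
```

Because `\R` sits inside the lookahead, the line break is not part of `$1`, so each record prints on a single line between the asterisks.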

Upvotes: 3
