Reputation: 20862
09/27/2009 19:48:00 Departure Location
I am trying to match and reformat the line above, which comes from a text file. The length of the text after the date and time can vary. I am reading the file line by line, and I need the final output to be printed as:
Date=> 09/27/2009
Time=> 19:48:00
Text=> Departure Location
I have tried to do the substitutions in one pass as follows:
if($line =~ m/(\d+)\/(\d+)\/(\d+)\h{1}(\d+):(\d+):(\d+)/){
$line =~ s/(\[a-zA-Z])/\nText=> $1/;
$line =~ s/(\d+)\/(\d+)\/(\d+)/\nDate=> $1\/$2\/$3/;
$line =~ s/\h{1}(\d+):(\d+):(\d+)/\nTime=> $1\:$2\:$3/;
print FH "$line\n";
}
But all I am getting is this:
Date=> 09/27/2009
Time=> 19:48:00 Departure Location
I know there is a problem in matching the Text, but I am not able to fix it. I am still a Perl beginner. Any help is appreciated. Thanks!
Upvotes: 2
Views: 4810
Reputation: 126722
Cramming as much functionality as possible into a small space only contributes to the reputation Perl has for being incomprehensible.
This code seems much clearer to me:
$line = <<END if $line =~ m|^(\d\d/\d\d/\d{4}) \s+ (\d\d:\d\d:\d\d) \s+ (.*)|x;
Date=> $1
Time=> $2
Text=> $3
END
Upvotes: 2
Reputation: 385789
You're doing too much work in your parser.
my ($date, $time, $text) = split(' ', $_, 3);
say "Date=> $date";
say "Time=> $time";
say "Text=> $text";
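For context, here is a minimal sketch of how the split approach might sit inside a line-by-line loop. The array @lines and its second entry are hypothetical stand-ins for the input file, and the "skip short lines" check is an assumption about how malformed lines should be handled, not something the question specifies.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the input file.
my @lines = (
    "09/27/2009 19:48:00 Departure Location\n",
    "09/28/2009 07:05:30 Arrival\n",
);

my @records;
for my $line (@lines) {
    chomp $line;

    # A LIMIT of 3 means the third field keeps the rest of the line,
    # including any spaces inside the text.
    my ($date, $time, $text) = split ' ', $line, 3;

    # Assumption: lines with fewer than three fields are skipped.
    next unless defined $text;

    push @records, "Date=> $date\nTime=> $time\nText=> $text\n";
}

print @records;
```

The special pattern ' ' splits on runs of whitespace and ignores leading whitespace, which is why no regex is needed here.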
Upvotes: 2
Reputation: 13942
This pattern in particular is giving you trouble:
$line =~ s/(\[a-zA-Z])/\nText=> $1/;
There are a few problems with it. First, the backslash in front of the left bracket, \[, escapes the bracket, so your character class isn't a character class at all but rather the literal text "[a-zA-Z]". Second, no whitespace is permitted in your text match, so if the text portion of the string contains any space characters (or punctuation), the pattern cannot match the whole text portion. Third, there is no quantifier, so it will only match a single character. A final note is that it should probably be anchored to the end of the string. It might work like this (but don't use it, read on instead):
$line =~ s/([a-zA-Z\s]+)$/\nText=> $1/;
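To make those points concrete, here is a small sketch that runs the three variants of the pattern against the sample line from the question. The variable names ($escaped, $single, $tail) are made up for illustration.

```perl
use strict;
use warnings;

my $line = '09/27/2009 19:48:00 Departure Location';

# Escaped bracket: the pattern looks for the literal text "[a-zA-Z]".
my $escaped = ($line =~ /(\[a-zA-Z])/) ? 1 : 0;

# Real character class, but no quantifier: captures one letter only.
my ($single) = $line =~ /([a-zA-Z])/;

# Character class with whitespace, quantified and anchored at the end.
my ($tail) = $line =~ /([a-zA-Z\s]+)$/;

print "escaped pattern matched: $escaped\n";
print "single capture: '$single'\n";
print "anchored capture: '$tail'\n";
```

Note that the anchored capture also picks up the space that precedes the text, which is one reason the substitution above is only a stopgap.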
But there's probably a better solution. It can all be done in one pass without losing clarity. To me it starts to make more sense if you capture larger segments:
$string =~ s{^
(\d\d/\d\d/\d{4})\s # The date.
(\d\d:\d\d:\d\d)\s # The time.
(.+)$ # The rest (the text).
}{Date=> $1\nTime=> $2\nText=> $3}x;
As is usually the case, the /x modifier makes the code easier to read.
There are some good resources available for getting a handle on Perl's regular expressions. I would suggest starting with perldoc perlretut, which is "a basic tutorial on understanding, creating and using regular expressions in Perl."
Using named captures can also add a degree of clarity, especially as your regexes become more complex:
$string =~ s{
^
(?<date>\d\d/\d\d/\d{4})\s
(?<time>\d\d:\d\d:\d\d)\s
(?<text>.+)
$
}
{Date=> $+{date}\nTime=> $+{time}\nText=> $+{text}}x;
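As a rough, self-contained sketch of the named-capture version, the snippet below also checks the return value of the substitution so that lines in an unexpected format are reported rather than passed through silently. The $changed variable and the "unrecognized line" message are illustrative assumptions, not part of the original answer.

```perl
use strict;
use warnings;
use 5.010;    # named captures and %+ were introduced in Perl 5.10

my $string = '09/27/2009 19:48:00 Departure Location';

my $changed = $string =~ s{
    ^
    (?<date>\d\d/\d\d/\d{4})\s
    (?<time>\d\d:\d\d:\d\d)\s
    (?<text>.+)
    $
}
{Date=> $+{date}\nTime=> $+{time}\nText=> $+{text}}x;

# s/// returns the number of substitutions made, so a false value
# means the line did not have the expected shape.
print $changed ? "$string\n" : "unrecognized line: $string\n";
```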
Upvotes: 4
Reputation: 118128
split with a limit would work nicely here. The pairwise is not strictly necessary, but helped me avoid a loop:
#!/usr/bin/env perl
use strict; use warnings;
use feature 'say';
use List::MoreUtils qw( pairwise );
my $input = q{09/27/2009 19:48:00 Departure Location};
my @fields = qw(Date Time Text);
my @values = split ' ', $input, @fields;
{
no warnings 'once';
say join("\n", pairwise { "$a=> $b" } @fields, @values);
}
Output:
Date=> 09/27/2009
Time=> 19:48:00
Text=> Departure Location
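If installing List::MoreUtils is not an option, a similar result can probably be had with core Perl alone. This is only a sketch: it pairs labels and values by index with map, and the @out array is a name introduced here for illustration.

```perl
use strict;
use warnings;

my $input  = q{09/27/2009 19:48:00 Departure Location};
my @fields = qw(Date Time Text);

# Use the number of labels as the split limit, so the last field
# keeps the remainder of the line.
my @values = split ' ', $input, scalar @fields;

# Pair each label with its value by index, using only core Perl.
my @out = map { "$fields[$_]=> $values[$_]" } 0 .. $#fields;

print join("\n", @out), "\n";
```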
Upvotes: 5