Reputation: 5059
I tried to use Text::CSV_XS to parse some logs. However, the following code doesn't do what I expected, which is to split the line into pieces on the separator " ".
The funny thing is, if I remove the double quotes from the string $a, then it does split the line.
I wonder whether it's a bug or I missed something. Thanks!
use Text::CSV_XS;
$a = 'id=firewall time="2010-05-09 16:07:21 UTC"';
$userDefinedSeparator = Text::CSV_XS->new({sep_char => " "});
print "$userDefinedSeparator\n";
$userDefinedSeparator->parse($a);
my $e;
foreach $e ($userDefinedSeparator->fields) {
print $e, "\n";
}
EDIT:
In the above code snippet, if I change the = (after time) to a space, then it works fine. I'm starting to wonder whether this is a bug after all.
$a = 'id=firewall time "2010-05-09 16:07:21 UTC"';
Upvotes: 1
Views: 2556
Reputation: 126722
You have confused the module by leaving both the quote character and the escape character set to the double quote ", while leaving double quotes embedded in the fields you want to split.
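One way to see the failure for yourself is to check the return value of parse rather than going straight to fields. This is a minimal sketch that assumes the default quote settings and uses only the parse, fields and error_diag methods; the variable names are chosen for illustration.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';

# Only sep_char is changed, so quote_char and escape_char keep their defaults
my $csv = Text::CSV_XS->new({ sep_char => ' ' });

# parse returns a false value when the line cannot be parsed
my $ok = $csv->parse($string);

if ($ok) {
    print "$_\n" for $csv->fields;
}
else {
    # In list context error_diag returns the error code, message and position
    my ($code, $message, $position) = $csv->error_diag;
    print "parse failed: $code $message at position $position\n";
}
```

Checking the status this way makes the problem visible as a parse error instead of an empty field list.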
Disable both quote_char and escape_char, like this:
use strict;
use warnings;
use Text::CSV_XS;
my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';
my $space_sep = Text::CSV_XS->new({
sep_char => ' ',
quote_char => undef,
escape_char => undef,
});
$space_sep->parse($string);
for my $field ($space_sep->fields) {
print "$field\n";
}
output
id=firewall
time="2010-05-09
16:07:21
UTC"
But note that, for this input, you have achieved exactly the same thing as print "$_\n" for split ' ', $string, which is to be preferred as it is both more efficient and more concise.
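As a complete program, the split version might look like the sketch below. It relies on the special case of split with a single-space string as the pattern, which splits on runs of whitespace and skips leading whitespace.

```perl
use strict;
use warnings;

my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';

# split ' ' is the awk-style special case: it splits on any run of
# whitespace and ignores leading whitespace
my @fields = split ' ', $string;

print "$_\n" for @fields;
```

Note that this still breaks the quoted timestamp into three pieces, just as the Text::CSV_XS version above does.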
In addition, you must always use strict and use warnings; and never use $a or $b as variable names, both because they are used by sort and because they are meaningless and undescriptive.
Update
As @ThisSuitIsBlackNot points out, your intention is probably not to split on spaces but to extract a series of key=value pairs. If so, this method puts the values straight into a hash.
use strict;
use warnings;
my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';
my %data = $string =~ / ([^=\s]+) \s* = \s* ( "[^"]*" | [^"\s]+ ) /xg;
use Data::Dump;
dd \%data;
output
{ id => "firewall", time => "\"2010-05-09 16:07:21 UTC\"" }
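The values captured this way keep their surrounding double quotes. If you would rather store them without the quotes, one possible variation is to capture the quoted and unquoted forms in separate groups. This is only a sketch; the loop and the variable names $quoted and $bare are illustrative.

```perl
use strict;
use warnings;

my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';

my %data;

# Group 2 captures the inside of a quoted value, group 3 an unquoted value
while ( $string =~ / ([^=\s]+) \s* = \s* (?: "([^"]*)" | ([^"\s]+) ) /xg ) {
    my ($key, $quoted, $bare) = ($1, $2, $3);
    $data{$key} = defined $quoted ? $quoted : $bare;
}

print "$_ => $data{$_}\n" for sort keys %data;
```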
Update
This program will extract the two name=value strings and print them on separate lines.
use strict;
use warnings;
my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';
my @fields = $string =~ / (?: "[^"]*" | \S )+ /xg;
print "$_\n" for @fields;
output
id=firewall
time="2010-05-09 16:07:21 UTC"
Upvotes: 3
Reputation: 67900
If you are not actually trying to parse CSV data, you can get the time field by using Text::ParseWords, which is a core module in Perl 5. The benefit of using this module is that it handles quotes very well.
use strict;
use warnings;
use Data::Dumper;
use Text::ParseWords;
my $str = 'id=firewall time="2010-05-09 16:07:21 UTC"';
my @fields = quotewords(' ', 0, $str);
print Dumper \@fields;
my %hash = map split(/=/, $_, 2), @fields;
print Dumper \%hash;
Output:
$VAR1 = [
'id=firewall',
'time=2010-05-09 16:07:21 UTC'
];
$VAR1 = {
'time' => '2010-05-09 16:07:21 UTC',
'id' => 'firewall'
};
I also included how you can make the data more accessible by adding it to a hash. Note that hashes cannot contain duplicate keys, so you need a new hash for each log line, that is, for each new time key.
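For several log lines, that could look something like the sketch below, which builds one hash per line and collects references to them in an array. The second sample line and the names @lines and @records are made up for illustration.

```perl
use strict;
use warnings;
use Text::ParseWords;

# Sample input; the second line is invented for this example
my @lines = (
    'id=firewall time="2010-05-09 16:07:21 UTC"',
    'id=firewall time="2010-05-09 16:08:02 UTC"',
);

my @records;
for my $line (@lines) {
    # One fresh hash per line, so repeated keys never collide
    my %record = map {
        my ($key, $value) = split /=/, $_, 2;
        ($key => $value);
    } quotewords(' ', 0, $line);
    push @records, \%record;
}

print "$_->{id} at $_->{time}\n" for @records;
```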
Upvotes: 3