Reputation: 21
I am trying to split a log line into its key=value pairs.
My input:
user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"
I am using this code block:
while ( my $line = <IN> ) {
    chomp $line;
    print "$line\n";
    my @values = split( /\s+/, $line );
    foreach $data (@values) {
        chomp $data;
        ( $key, $value ) = split( /=/, $data );
        $key =~ s/\s+//g;
        $key =~ s/"//g;
    }
}
I am receiving the output below. The split also breaks on the spaces inside the quoted values. How can I split the keys and values correctly for the above input?
_1;
Linux
x86_64;
rv:23.0)
Gecko/20100101es,OU
(X1
Thanks in Advance
Upvotes: 1
Views: 115
Reputation: 35198
You can use alternative capture group numbering, (?|...), as described in perlretut, to capture each value as either a quote-enclosed string or a run of non-space characters.
Because the capture groups then come out as key/value pairs, you can initialize your hash directly, like so:
use strict;
use warnings;
use Data::Dump;

while (<DATA>) {
    chomp;
    # Each match returns a key and a value, so the list assigns straight into the hash.
    my %hash = /\G([^=]+)=(?|"([^"]*)"|(\S*))\s*/g;
    dd \%hash;
}
__DATA__
user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"
Outputs:
{
message => "Authentication success",
request_id => "bbfd6a1f-90c4-45g52-9e7c-db5",
user_agent => "Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0",
}
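To see what the branch reset contributes, here is a minimal sketch that runs the same pattern with and without (?|...). The sample line is a shortened, hypothetical variant of the input in the question, and the variable names are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample line, shortened from the input in the question.
my $line = 'request_id=abc-123 message="Authentication success"';

# Without branch reset: the quoted value is group 2 and the bare value is
# group 3, so each match returns three elements, one of which is undef.
my @plain = $line =~ /\G([^=]+)=(?:"([^"]*)"|(\S*))\s*/g;

# With branch reset (?|...): both alternatives number their group as 2,
# so each match returns exactly a key and a value.
my @reset = $line =~ /\G([^=]+)=(?|"([^"]*)"|(\S*))\s*/g;

print scalar(@plain), " elements without branch reset\n";
print scalar(@reset), " elements with branch reset\n";

# Only the branch-reset list is a clean key/value list for a hash.
my %hash = @reset;
print "$_ => $hash{$_}\n" for sort keys %hash;
```

With two pairs in the line, the first list has six elements and the second has four, which is why only the second can be assigned to a hash as is.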
Upvotes: 0
Reputation: 2935
This solution makes use of the (?|...) branch-reset groups introduced in Perl 5.10. If you don't want to save into a hash, you can extend the line with the while loop: inside the while, the key is in $1 and the value is in $2.
#!/usr/bin/env perl
use warnings;
use strict;
use 5.010;    # branch reset (?|...) needs Perl 5.10 or later

while (<DATA>) {
    chomp;
    my %header;
    $header{$1} = $2 while (/\G\s*(\S+)=(?|"([^"]*)"|(\S*))/g);    # extend here
    printf "%9s => %s\n", $_, $header{$_} for keys %header;
}
__DATA__
user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"
This prints:
  message => Authentication success
user_agent => Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0
request_id => bbfd6a1f-90c4-45g52-9e7c-db5
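The remark about extending the while loop could look roughly like the sketch below, which uses a block form of the loop and works with $1 and $2 directly instead of filling a hash. The @pairs array is a hypothetical container added here only to keep the pairs in input order.

```perl
use strict;
use warnings;

# Same sample line as in the question.
my $line = 'user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"';

my @pairs;    # hypothetical container, keeps the pairs in input order

# Each iteration matches one key=value pair, continuing where the last match ended.
while ( $line =~ /\G\s*(\S+)=(?|"([^"]*)"|(\S*))/g ) {
    # Copy the captures right away, before another regex can overwrite them.
    my ( $key, $value ) = ( $1, $2 );
    push @pairs, [ $key, $value ];
    print "$key: $value\n";
}
```

Unlike the hash version, this keeps duplicate keys and the original ordering, which may or may not matter for your log format.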
If the quoting gets more complex, you should look at Text::Balanced with its extract_quotelike routine.
Upvotes: 0
Reputation: 50637
Assuming that " does not appear as a valid character inside a value:
my %hash;
while ( my $line = <IN> ) {
    # $2 holds a quoted value, $3 an unquoted one; // picks whichever is defined
    $hash{$1} = ( $2 // $3 ) while $line =~ /(\w+)=(?: "(.+?)" | (\S+) )/xg;
}
Upvotes: 1