coolent

Reputation: 21

perl split multiple commas in line

I am trying to split a line of space-separated key=value pairs into its keys and values.

my input:

user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"

I am using this code block

while ( my $line = <IN> ) {
    chomp $line;
    print "$line\n";
    my @values = split( /\s+/, $line );

    foreach $data (@values) {
        chomp $data;
        ( $key, $value ) = split( /=/, $data );
        $key =~ s/\s+//g;
        $key =~ s/"//g;
    }
}

I am receiving the output below: the split also breaks on the spaces inside the quoted values. How can I split the keys and values exactly from the above input?

_1;
Linux
x86_64;
rv:23.0)
Gecko/20100101es,OU
(X1

Thanks in Advance

Upvotes: 1

Views: 115

Answers (3)

Miller

Reputation: 35198

You can use alternative capture group numbering, the (?|...) construct described in perlretut, to capture each value as either a quote-enclosed string or a run of non-space characters.

Then because the capture groups are arranged in key value pairs, it's possible to directly initialize your hash like so:

use strict;
use warnings;

while (<DATA>) {
    chomp;
    my %hash = /\G([^=]+)=(?|"([^"]*)"|(\S*))\s*/g;

    use Data::Dump;
    dd \%hash;
}

__DATA__
user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"

Outputs:

{
  message    => "Authentication success",
  request_id => "bbfd6a1f-90c4-45g52-9e7c-db5",
  user_agent => "Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0",
}
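To see why the alternative numbering matters for the hash initialization, here is a minimal sketch comparing the same pattern with a plain non-capturing group and with (?|...). The shortened sample line and the variable names are assumptions made for illustration, not part of the original input.

```perl
use strict;
use warnings;

# Hypothetical sample line, shortened from the question's input.
my $line = 'id=42 msg="hello world"';

# Plain alternation: the quoted branch is group 2 and the bare branch is
# group 3, so every match returns three items and one of them is undef.
my @plain = $line =~ /\G([^=]+)=(?:"([^"]*)"|(\S*))\s*/g;

# Alternative numbering: both branches store into group 2, so every match
# returns exactly (key, value) and the list can initialize a hash directly.
my @branch = $line =~ /\G([^=]+)=(?|"([^"]*)"|(\S*))\s*/g;

print scalar(@plain),  " items without (?|...)\n";
print scalar(@branch), " items with (?|...)\n";
```

With the plain group the list has six items, including undef placeholders, so assigning it to a hash would scramble the pairs; with (?|...) it has four, alternating key and value.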


Upvotes: 0

Patrick J. S.

Reputation: 2935

This solution uses the (?|...) branch reset groups introduced in Perl 5.10. If you don't want to save the pairs into a hash, you can expand the statement-modifier while into a full while loop; inside the loop, the key is in $1 and the value is in $2.

#!/usr/bin/env perl

use warnings;
use strict;
use 5.010;

while (<DATA>){
  chomp;
  my %header;
  $header{$1} = $2 while (/\G\s*(\S+)=(?|"([^"]*)"|(\S*))/g); #extend here
  printf "%-10s => %s\n", $_, $header{$_} for keys %header;
}


__DATA__
user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"

This prints (the order of the keys may vary):

message    => Authentication success
user_agent => Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0
request_id => bbfd6a1f-90c4-45g52-9e7c-db5
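As a minimal sketch of the expanded loop form, assuming you only want to handle each pair as it is found rather than store it in a hash: the regex is the same as above, while $line and @seen are names introduced here for illustration.

```perl
use strict;
use warnings;

my $line = 'user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"';

my @seen;    # collected here only so the result can be inspected afterwards
while ( $line =~ /\G\s*(\S+)=(?|"([^"]*)"|(\S*))/g ) {
    # Copy the captures right away; a later successful match overwrites them.
    my ( $key, $value ) = ( $1, $2 );
    push @seen, "$key: $value";
}

print "$_\n" for @seen;
```

Because the pairs are handled in the order they are matched, this form also preserves the order of the fields on the line, which a hash does not.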

If the quoting gets more complex, you should look at Text::Balanced with its extract_quotelike routine.
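A rough sketch of what that could look like, under the assumption that keys are plain \w+ words and that quoted values always start with a double quote. The parse_pairs helper and the shortened sample line are hypothetical; only extract_quotelike comes from Text::Balanced.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_quotelike);

# Hypothetical helper: parse one line of key=value pairs into a hash ref.
sub parse_pairs {
    my ($line) = @_;
    my %pairs;

    # Peel one "key=" off the front of the line per iteration.
    while ( $line =~ s/^\s*(\w+)=// ) {
        my $key = $1;
        if ( $line =~ /^"/ ) {
            # In list context extract_quotelike returns the extracted text at
            # index 0, the remainder at index 1, and the text between the
            # delimiters at index 5.
            my @parts = extract_quotelike($line);
            last unless defined $parts[0] && length $parts[0];    # unbalanced quote
            $pairs{$key} = $parts[5];
            $line = $parts[1];
        }
        else {
            # Unquoted value: everything up to the next whitespace.
            $line =~ s/^(\S*)//;
            $pairs{$key} = $1;
        }
    }
    return \%pairs;
}

# Hypothetical sample line, shortened from the question's input.
my $pairs = parse_pairs('user_agent="Mozilla/5.0 (X11)" request_id=abc-123 message="Authentication success"');
print "$_ => $pairs->{$_}\n" for sort keys %$pairs;
```

Unlike the "([^"]*)" branch in the regex, extract_quotelike accepts backslash-escaped quotes inside a value; this sketch leaves the backslashes in the stored text.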

Upvotes: 0

mpapec

Reputation: 50637

Assuming that a double quote (") never appears inside a value:

my %hash;
while (my $line = <IN>)
{
  $hash{$1} = ($2 // $3) while $line =~ /(\w+)=(?: "(.+?)" | (\S+) )/xg;
}

Upvotes: 1
