Reputation: 17
Input (a GMF file):
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| |
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| |
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446
In my Perl code, I use the following regex to extract the fields from each line:
if($line=~m/^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW).*?\s(.*?)\|(\d+)\|.*\|(.*?)$/)
{
$tag=$1;
$lineTxt=$2;
$usage = $3;
$amt = $4;
}
Output:
tag :: CUSTEVSUMMROW_GPRS lineTxt :: GPRS - Nova Subscriber Non-Smartphone Package usage :: 3126 amt ::
tag :: CUSTEVSUMMROW_GPRS lineTxt :: GPRS - Nova Subscriber Smartphone Package usage :: 3126 amt ::
tag :: CUSTEVSUMMROW_GPRS lineTxt :: GPRS - Nova Subscriber Non-Smartphone Package - Charged usage :: 3126 amt :: 234446
How can I also retrieve and print the unit used (MB or GB)? Can anyone please help me out?
Upvotes: 0
Views: 42
Reputation: 53508
Given what you have there:
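if($line=~m/^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW).*?\s(.*?)\|(\d+)\|(.*?)\|.*\|(.*?)$/)
{
    $tag=$1;
    $lineTxt=$2;
    $usage = $3;
    $units = $4;   # new capture: the column right after the usage count
    $amt = $5;     # still the last column, as in your original pattern
}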
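To see what a five-capture pattern returns on the sample data, here is a minimal sketch. It assumes that amt should remain the last column, as in the question's output, so the tail of the pattern is kept as `\|.*\|(.*?)$`. The helper name `parse_line` is made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Two of the sample lines from the question.
my @lines = (
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| |',
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446',
);

# parse_line is a hypothetical helper: it returns
# (tag, lineTxt, usage, units, amt), or an empty list if the line does not match.
sub parse_line {
    my ($line) = @_;
    if ( $line =~ m/^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW).*?\s(.*?)\|(\d+)\|(.*?)\|.*\|(.*?)$/ ) {
        return ( $1, $2, $3, $4, $5 );
    }
    return;
}

for my $line (@lines) {
    # Skip lines that do not match the pattern.
    my ( $tag, $lineTxt, $usage, $units, $amt ) = parse_line($line) or next;
    print "tag :: $tag lineTxt :: $lineTxt usage :: $usage units :: $units amt :: $amt\n";
}
```

The units value comes from the fourth capture group; amt is empty for lines whose last column is empty.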
But I'd suggest that's not the best way to approach this problem. I'd use split on the pipe
delimiter and process the first field separately.
Something like this, maybe:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my @fields = qw ( tag lineTxt usage units amt );
while (<DATA>) {
    my ( $first_field, @record ) = split '\|';
    #split the first field on _just_ the first space.
    unshift( @record, $first_field =~ m/^(\w+) (.*)$/ );
    #use a hash slice to put that record into a hash of named keys.
    my %data;
    @data{@fields} = @record;
    print Dumper \%data;
    # can of course, make this an array of hashes quite easily.
}
__DATA__
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| |
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| |
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446
This prints one hash per record. For the last line, for example:
$VAR1 = {
'units' => 'GB',
'tag' => 'CUSTEVSUMMROW_GPRS',
'amt' => '7500000',
'usage' => '3126',
'lineTxt' => 'GPRS - Nova Subscriber Non-Smartphone Package - Charged'
};
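The "array of hashes" remark in the code comment could be sketched roughly like this. The helper name `parse_record` and the `@lines` array are assumptions for illustration (the array stands in for the `__DATA__` section). As in the code above, only the first five columns are named, so amt holds the fourth pipe-separated value and any later column is dropped.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @fields = qw( tag lineTxt usage units amt );

# Sample lines from the question, standing in for a file or __DATA__.
my @lines = (
    "CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| |\n",
    "CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446\n",
);

# parse_record is a hypothetical helper: it returns a hash reference
# keyed by @fields, or undef if the first field has no "tag text" shape.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    my ( $first_field, @rest ) = split /\|/, $line;

    # Split the first field on just the first space.
    my ( $tag, $text ) = $first_field =~ m/^(\w+) (.*)$/
        or return;

    # Hash slice: extra values beyond the five named fields are discarded.
    my %data;
    @data{@fields} = ( $tag, $text, @rest );
    return \%data;
}

# Collect every parsed line into an array of hashes.
my @records;
for my $line (@lines) {
    my $record = parse_record($line) or next;
    push @records, $record;
}

# Print the usage together with its unit for each record.
printf "%s: %s %s\n", @{$_}{qw( lineTxt usage units )} for @records;
```

Keeping the records in an array makes it easy to filter or total them later, for instance by the units key.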
Upvotes: 1
Reputation: 242343
You don't capture the column after \d+. Add parentheses to do so.
.* is greedy, i.e. it matches as much as it can. Add a ? to make it frugal:
if ($line =~ /^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW).*?\s(.*?)\|(\d+)\|(.*?)\|/)
You can also rewrite the alternation as
(CUSTEVSUMMROW(?:_GPRS)?)
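A small sketch of both points, using the last sample line from the question. The sub name `extract_tag` and the two short strings passed to it are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $line = 'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446';

# Greedy: (.*) runs to the last pipe it can still match before.
my ($greedy) = $line =~ /\|\d+\|(.*)\|/;

# Frugal: (.*?) stops at the first pipe after the usage column.
my ($frugal) = $line =~ /\|\d+\|(.*?)\|/;

print "greedy: $greedy\n";
print "frugal: $frugal\n";

# extract_tag is a hypothetical helper showing the optional-suffix form
# of the alternation: one base name plus an optional _GPRS.
sub extract_tag {
    my ($text) = @_;
    my ($tag) = $text =~ /^(CUSTEVSUMMROW(?:_GPRS)?)/;
    return $tag;
}

print extract_tag('CUSTEVSUMMROW_GPRS foo'), "\n";
print extract_tag('CUSTEVSUMMROW bar'),      "\n";
```

The greedy capture swallows the following column as well, while the frugal one yields just the unit. The optional non-capturing group matches either tag without repeating the common prefix.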
Upvotes: 3