Cleber Goncalves

Reputation: 1996

Selectively splitting a string in Perl

New to Perl.

I need to parse a report that looks like this:

2012-05-29@emaillocalpart@emaildomain@customerid@errormessage@messageid

I used:

my @fields = split(/@/, $line, 6);

Most of the time it works fine, but sometimes the error message contains an email address. In that case everything after the @ in that address, up to the end of the line, ends up in my message id field.

I thought about counting the @s and parsing conditionally, but is there a better way?

EDIT:

The desired output is a list of strings, with the error message containing whatever came in it (including an occasional email address).

Since other applications use the same report, I cannot change the separator or escape the output.

Sample lines on the report:

2012-05-29@joedoe@example.com@AB99-5@440 4.4.1 Some error occurred@XYZ35
2012-05-29@foobar@invalid.com@ZZ88-6@550 5.1.1 <[email protected]>... User Unknow@GGH93
2012-05-29@[email protected]@YY88-0@550 5.1.1 [email protected] no such user@GGH93

Expected contents of @fields after parsing line 1:

2012-05-29
joedoe
example.com
AB99-5
440 4.4.1 Some error occurred
XYZ35

And after parsing line 2:

2012-05-29
foobar
invalid.com
ZZ88-6
550 5.1.1 <[email protected]>... User Unknow
GGH93

Upvotes: 6

Views: 448

Answers (4)

Qtax

Reputation: 33928

Similar to daxim's answer, but another way of writing it:

my $re = '^' . '([^@]*)@'x4 . '(.*)@([^@]*)$';
my @fields = $line =~ /$re/; 

You may also want to do some error checking here:

my @fields = $line =~ /$re/ or die "can't parse '$line'";
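
As a rough sketch of how this pattern behaves on a line like the second sample from the question: the four [^@]* groups take the leading fields, the greedy (.*) takes everything up to the last @, and the final group takes the message id. The helper name parse_line and the address inside <...> are invented for illustration; they are not part of the original answer or report.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original answer.
# Returns the six fields, or an empty list if the line does not match.
sub parse_line {
    my ($line) = @_;
    # Four fields without @, then a greedy error message, then the message id.
    my $re = '^' . '([^@]*)@' x 4 . '(.*)@([^@]*)$';
    return $line =~ /$re/;
}

# Sample data modeled on the question; the address inside <...> is invented.
my $line = '2012-05-29@foobar@invalid.com@ZZ88-6@550 5.1.1 <someone@example.org>... User Unknow@GGH93';
my @fields = parse_line($line) or die "can't parse '$line'";
print "$_\n" for @fields;
```

Because (.*) is greedy, any number of @ characters inside the error message stay in the fifth field, as long as the message id itself never contains an @.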

Upvotes: 1

kevlar1818

Reputation: 3125

This properly parses optional email addresses:

$str = '5-29@foobar@invalid.com@ZZ88-6@550 5.1.1 <[email protected]>... User Unknow@GGH93';
#$str= '2012-05-29@joedoe@example.com@AB99-5@440 4.4.1 Some error occurred@XYZ35';

# Replace the first <...>-delimited email address with !! and remember it
$email = $str =~ s/(<[^>]+>)/!!/ ? $1 : undef;

@fields = split(/@/, $str); # split on @

# Put the email address back, but only if one was actually found
if (defined $email) { s/!!/$email/ foreach @fields; }

print STDERR map { "$_\n" } @fields; # print fields to standard error

This assumes at most one optional email address, and that it is delimited by < >. With a little work it could be modified to handle a string with any number of < > delimited emails.
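
One possible way to extend this to any number of < > delimited addresses is sketched below. It swaps each <...> for a numbered placeholder, splits, then restores the saved text. The helper name split_protecting_brackets, the \x00 placeholder byte (assumed never to occur in the report), and the addresses in the sample line are all invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: protects every <...>-delimited address before splitting.
sub split_protecting_brackets {
    my ($str) = @_;
    my @saved;
    # Swap each <...> for a numbered placeholder.
    # Assumption: the byte \x00 never appears in the report itself.
    $str =~ s/(<[^>]+>)/push @saved, $1; "\x00" . $#saved . "\x00"/ge;
    my @fields = split /@/, $str, -1;   # -1 keeps trailing empty fields
    # Restore the saved addresses in every field.
    s/\x00(\d+)\x00/$saved[$1]/g for @fields;
    return @fields;
}

# Invented sample line with two bracketed addresses in the error message.
my @fields = split_protecting_brackets(
    '2012-05-29@foobar@invalid.com@ZZ88-6@550 <a@example.org> and <b@example.org> unknown@GGH93'
);
print "$_\n" for @fields;
```

Like the original, this only protects addresses wrapped in < >; a bare address inside the error message (as in the third sample line of the question) would still be split.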

Upvotes: 1

Soz

Reputation: 963

If $teststr contains, for example: '2012-05-29@emaillocalpart@emaildomain@customerid@error@me@ssage@messageid';

the following code:

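use Data::Dumper;

my @fields2=split('@',$teststr);
my @finalfields=@fields2[0 .. 3];                     # the first four fields never contain @
my $finalat=$#fields2-1;                              # index of the last piece of the error message
my $errormessage=join('@',@fields2[4 .. $finalat]);   # glue the error message back together
push(@finalfields,$errormessage);
push(@finalfields,$fields2[$#fields2]);               # the last piece is the message id

print Data::Dumper->Dump([@finalfields])."\n";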

gives the following output:

$VAR1 = '2012-05-29';
$VAR2 = 'emaillocalpart';
$VAR3 = 'emaildomain';
$VAR4 = 'customerid';
$VAR5 = 'error@me@ssage';
$VAR6 = 'messageid';

Apologies - it's rather a verbose solution. You can also do the same in one regular expression:

if ($teststr=~/^([^@]*)@([^@]*)@([^@]*)@([^@]*)@(.*)@([^@]*)$/) {
    print "$1\n$2\n$3\n$4\n$5\n$6\n";   # only use the captures when the match succeeded
}
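
For comparison, the split-and-rejoin idea can be written a little more compactly with splice and pop. This is only a sketch under the same assumption as above (the first four fields and the message id never contain @); the helper name split_report_line is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split on @, then rejoin the middle pieces as the error message.
sub split_report_line {
    my ($line) = @_;
    my @parts = split /@/, $line, -1;   # -1 keeps trailing empty fields
    return if @parts < 6;               # too few fields to be a valid line
    my @head       = splice @parts, 0, 4;   # date, local part, domain, customer id
    my $message_id = pop @parts;            # the last piece is the message id
    # Whatever is left belongs to the error message, so put its @s back.
    return (@head, join('@', @parts), $message_id);
}

my @fields = split_report_line(
    '2012-05-29@emaillocalpart@emaildomain@customerid@error@me@ssage@messageid'
);
print "$_\n" for @fields;
```

Returning an empty list for short lines lets the caller write `my @fields = split_report_line($line) or die ...` in the same style as the regex version.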

Upvotes: 4

Mike Thomsen

Reputation: 37526

The easiest way to handle this would be to change @ to a different, much less common delimiter, such as ;;;;

Upvotes: 0
