Nano HE

Reputation: 9297

Question On Text Parsing In Perl

I want to parse a line like this:

S1,F2  title including several white spaces  (abbr) single,Here<->There,reply

And I want the output to be:

1
2
title including several white spaces
abbr
single
Here22There  # identify <-> and translate it to 22
reply

How should I parse the line above?

Method 1. Split the whole line into four segments, then parse each segment individually.

segment1. S1,F2

segment2. title including several white spaces

segment3. abbr

segment4. single,Here<->There,reply

Method 2. Write a single complex regular expression to parse the whole line.

Which method makes more sense in practice?

Any comments or suggestions are appreciated.

Upvotes: 0

Views: 189

Answers (2)

Nikhil Jain

Reputation: 8332

For your first method, you can start by splitting the string on commas, like this:

my $line =
    'S1,F2  title including several white spaces  (abbr) single,Here<->There,reply';
# Split on commas: the first two fields still need a regex each.
my ($field1, $field2, $field3, $field4) = split /,/, $line;

and then apply a regex to the first field (S1) and to the second field (F2 title including several white spaces (abbr) single), like this:

my ($field5) = $field1 =~ /S(\d+)/;
my ($field6, $field7, $field8, $field9) = 
                    $field2 =~ m/^F(\d+)\s+(.*?)\((.*?)\)\s+(.*?)$/;

This works for all of the following strings and avoids having to write and maintain one complex regular expression:

S1,F2  title including several white spaces  (abbr) single,Here<->There,reply
S1,F2  title including several white spaces  (abbr) single,Here<->There
S1,F2  title including several white spaces  (abbr) single,Here<->There,[reply]
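
The split-then-match steps above can be put together as a minimal sketch. The helper name parse_by_split is hypothetical, and the sketch makes two assumptions that are not in the answer: it adds \s* before the opening parenthesis so the title is captured without its trailing spaces, and it assumes the title itself never contains a comma.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the split-then-match approach.
sub parse_by_split {
    my ($line) = @_;

    # Step 1: cut the line at commas (assumes no comma inside the title).
    my ($seg1, $seg2, $where, $reply) = split /,/, $line;
    return unless defined $seg2;

    # Step 2: one small regex per segment.
    my ($s) = $seg1 =~ /^S(\d+)$/ or return;
    my ($f, $title, $abbr, $kind) =
        $seg2 =~ /^F(\d+)\s+(.*?)\s*\((.*?)\)\s+(.*?)$/ or return;

    # Translate <-> to 22, as the question asks.
    $where =~ s/<->/22/ if defined $where;

    # Drop trailing fields that were absent from the input.
    return grep { defined } ($s, $f, $title, $abbr, $kind, $where, $reply);
}

print "$_\n" for parse_by_split(
    'S1,F2  title including several white spaces  (abbr) single,Here<->There,reply');
```

For the first sample line this should print the seven values from the question, one per line. The second sample line, which has no reply field, yields six values instead of seven.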

Upvotes: 1

codaddict

Reputation: 454912

Assuming your input is always in the format specified, you could use a single regex like:

^S(\d+),F(\d+)\s+(.*?)\((.*?)\)\s+(.*?),(.*?),(.*)$

Codepad link
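
As a rough sketch of how that single regex could be used, the snippet below wraps it in a hypothetical helper named parse_line. It differs from the regex above in one labeled detail: \s* is added before the opening parenthesis so the title is captured without trailing spaces. The <-> to 22 translation is done as a separate substitution after the match.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one line in the question's format.
sub parse_line {
    my ($line) = @_;

    # Seven captures: S number, F number, title, abbr, kind, route, reply.
    # Assumption: \s* before \( trims the spaces after the title.
    my @fields =
        $line =~ /^S(\d+),F(\d+)\s+(.*?)\s*\((.*?)\)\s+(.*?),(.*?),(.*)$/
        or return;

    # Translate <-> to 22 in the sixth field, as the question asks.
    $fields[5] =~ s/<->/22/;

    return @fields;
}

my $line =
    'S1,F2  title including several white spaces  (abbr) single,Here<->There,reply';
print "$_\n" for parse_line($line);
```

Unlike the split-based approach, this pattern requires all three comma-separated trailing fields, so a line without the final reply field does not match and the helper returns an empty list.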

Upvotes: 2
