Reputation: 9297
I want to parse a line like this:
S1,F2 title including several white spaces (abbr) single,Here<->There,reply
and I want the output to be:
1
2
title including several white spaces
abbr
single
Here22There # identify <-> and translate it to 22;
reply
How should I parse the line above?
Method 1. Split the whole line into four segments, then parse each segment individually.
segment1. S1,F2
segment2. title including several white spaces
segment3. abbr
segment4. single,Here<->There,reply
Method 2. Write a single complex regular expression to parse the whole line.
Which method makes more sense in practice?
Any comments or suggestions are appreciated.
Upvotes: 0
Views: 189
Reputation: 8332
Regarding your first method, you can first split the string on commas, like this:
my $line =
'S1,F2 title including several white spaces (abbr) single,Here<->There,reply';
my ($field1, $field2, $field3, $field4) = split /,/, $line;
and then apply a regex to the field containing the substring S1
and to the field F2 title including several white spaces (abbr) single,
like this:
my ($field5) = $field1 =~ /S(\d+)/;
my ($field6, $field7, $field8, $field9) =
$field2 =~ m/^F(\d+)\s+(.*?)\s*\((.*?)\)\s+(.*?)$/;  # \s* keeps the trailing space out of the title
This works for all of the following strings and avoids having to write one complex regular expression:
S1,F2 title including several white spaces (abbr) single,Here<->There,reply
S1,F2 title including several white spaces (abbr) single,Here<->There
S1,F2 title including several white spaces (abbr) single,Here<->There,[reply]
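Put together, the split-then-match approach might look roughly like the sketch below. The name parse_line and the variable names are hypothetical, chosen only for illustration. It also adds the <-> to 22 translation that the question asks for, and a \s* before the opening parenthesis so the title does not keep a trailing space.

```perl
use strict;
use warnings;

# Hypothetical helper: split on commas first, then apply small regexes.
sub parse_line {
    my ($line) = @_;
    my ($head, $middle, $route, $tail) = split /,/, $line;

    # "S1" -> 1
    my ($s_num) = $head =~ /^S(\d+)$/;

    # "F2 title ... (abbr) single" -> 2, title, abbr, single
    my ($f_num, $title, $abbr, $kind) =
        $middle =~ /^F(\d+)\s+(.*?)\s*\((.*?)\)\s+(.*)$/;

    # Translate the literal "<->" marker to "22", as the question asks.
    $route =~ s/<->/22/ if defined $route;

    return ($s_num, $f_num, $title, $abbr, $kind, $route, $tail);
}

my $line = 'S1,F2 title including several white spaces (abbr) single,Here<->There,reply';

# Print one field per line, skipping any field that is missing.
print "$_\n" for grep { defined } parse_line($line);
```

When the last field is absent, as in the second sample line, the final element is simply undefined and is skipped when printing.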
Upvotes: 1
Reputation: 454912
Assuming your input is in the format specified, you could use a regex like:
^S(\d+),F(\d+)\s+(.*?)\((.*?)\)\s+(.*?),(.*?),(.*)$
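As a rough sketch of how that pattern could be used from Perl: the version below is the same regex written with the /x flag for readability, with one assumed adjustment (a \s* before the opening parenthesis so the title has no trailing space). The name parse_with_regex is hypothetical, and the <-> to 22 substitution is added to match the output the question asks for.

```perl
use strict;
use warnings;

my $re = qr/
    ^S(\d+),          # number after S
    F(\d+)\s+         # number after F
    (.*?)\s*          # title, without trailing spaces
    \((.*?)\)\s+      # abbreviation inside the parentheses
    (.*?),            # word after the abbreviation
    (.*?),            # field holding the arrow marker
    (.*)$             # last field
/x;

# Hypothetical helper: returns the seven fields, or an empty list on no match.
sub parse_with_regex {
    my ($line) = @_;
    my @fields = $line =~ $re or return;
    $fields[5] =~ s/<->/22/;    # translate the arrow marker to 22
    return @fields;
}

my $line = 'S1,F2 title including several white spaces (abbr) single,Here<->There,reply';
print "$_\n" for parse_with_regex($line);
```

Note that this pattern requires all three commas, so a line that lacks the final field would not match at all.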
Upvotes: 2