Doug Hysell

Reputation: 11

Parse Unknown-Length Text from a File Using Regular Expressions (Regex)

I am trying to extract text from a text file, but the text I need to collect varies in length. This is my first attempt at using regex, and I could use some suggestions.

Here is the source text. I am trying to extract (parse) only the Name, Email, Birthdate and Phone Number. Any help would be appreciated.

Basic data
</td><td align="left" width="10" style="padding:0; margin:0;"> </td><td align="left" width="290" style="padding:0;"> </td></tr><tr><td align="right" width="250" style="padding-bottom:8px; margin:0; color: #555555; font-family: Arial, Helvetica, sans-serif; font-size:14px;">
Name:
</td><td align="left" width="10" style="padding:0; margin:0;"> </td><td align="left" width="290" style="color: #262626; padding-bottom:8px ; font-family: Arial, Helvetica, sans-serif; font-size:14px;">Test User3</td></tr><tr><td align="right" width="250" style="padding-bottom:8px; margin:0; color: #555555; font-family: Arial, Helvetica, sans-serif; font-size:14px;">
Email:
</td><td align="left" width="10" style="padding:0; margin:0;"> </td><td align="left" width="290" style="color: #262626; padding-bottom:8px ; font-family: Arial, Helvetica, sans-serif; font-size:14px;"><span style="color: #262626; text-decoration:none;">[email protected]</span></td></tr><tr><td align="center" colspan="3" height="20" width="100%" style="color: #262626; padding:0; margin:0; line-height:20px;"> </td></tr><tr><td align="right" width="250" style="padding-bottom:8px; margin:0; color: #002a5c; font-family: Arial, Helvetica, sans-serif; font-size:14px;">
Custom data
</td><td align="left" width="10" style="padding:0; margin:0;"> </td><td align="left" width="290" style="padding:0;"> </td></tr><tr><td align="right" width="250" style="padding-bottom:8px; margin:0; color: #555555; font-family: Arial, Helvetica, sans-serif; font-size:14px;">ref:
</td><td align="left" width="10" style="padding:0; margin:0;"> </td><td align="left" width="290" style="color: #262626; padding-bottom:8px ; font-family: Arial, Helvetica, sans-serif; font-size:14px;">06/16/1963</td></tr><tr><td align="right" width="250" style="padding-bottom:8px; margin:0; color: #555555; font-family: Arial, Helvetica, sans-serif; font-size:14px;">cellphone:
                                                            </td><td align="left" width="10" style="padding:0; margin:0;"> </td><td align="left" width="290" style="color: #262626; padding-bottom:8px ; font-family: Arial, Helvetica, sans-serif; font-size:14px;">6152498588</td></tr><tr><td align="center" colspan="3" height="20" width="100%" style="color: #262626; padding:0; margin:0; line-height:20px;"> </td></tr><tr><td align="right" width="250" style="padding-bottom:8px; margin:0; color: #002a5c; font-family: Arial, Helvetica, sans-serif; font-size:14px;">

Thanks in advance,

Doug

Upvotes: 0

Views: 153

Answers (2)

David Brabant

Reputation: 43559

Use the HTML Agility Pack instead. Parsing HTML with regex is a bad thing, except for very specific cases.
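To illustrate the parser-based approach in general terms, here is a minimal sketch in Python using the standard library's `html.parser` module. It is not the HTML Agility Pack (a .NET library), only the same idea in another language. The `CellTextParser` class, the `extract_fields` function, the `LABELS` mapping and the trimmed-down `SAMPLE` string are all hypothetical names made up for this example. The email address is a placeholder, and mapping the `ref:` label to the birthdate is an assumption based on the date that follows it in the question's sample.

```python
from html.parser import HTMLParser


class CellTextParser(HTMLParser):
    """Collect the non-empty text of every <td> cell, in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._buffer = None  # None while we are outside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._buffer = []

    def handle_endtag(self, tag):
        # A stray </td> with no matching <td> is simply ignored.
        if tag == "td" and self._buffer is not None:
            text = "".join(self._buffer).strip()
            if text:
                self.cells.append(text)
            self._buffer = None

    def handle_data(self, data):
        # Text inside nested tags such as <span> is captured as well.
        if self._buffer is not None:
            self._buffer.append(data)


def extract_fields(html, labels):
    """Pair each known 'label:' cell with the next non-empty cell."""
    parser = CellTextParser()
    parser.feed(html)
    parser.close()
    cells = parser.cells
    fields = {}
    for label_cell, value_cell in zip(cells, cells[1:]):
        key = label_cell.rstrip(":").strip().lower()
        if label_cell.endswith(":") and key in labels:
            fields[labels[key]] = value_cell
    return fields


# Hypothetical mapping from labels in the HTML to output field names.
# Assumption: "ref" holds the birthdate, as the sample data suggests.
LABELS = {"name": "name", "email": "email", "ref": "birthdate", "cellphone": "phone"}

# Trimmed-down stand-in for the markup in the question (attributes shortened,
# email replaced with a placeholder).
SAMPLE = """
<table><tr><td align="right">
Name:
</td><td width="10"> </td><td align="left">Test User3</td></tr>
<tr><td align="right">
Email:
</td><td width="10"> </td><td align="left"><span>user@example.com</span></td></tr>
<tr><td align="right">ref:
</td><td width="10"> </td><td align="left">06/16/1963</td></tr>
<tr><td align="right">cellphone:
        </td><td width="10"> </td><td align="left">6152498588</td></tr></table>
"""

if __name__ == "__main__":
    print(extract_fields(SAMPLE, LABELS))
```

Because the values are read from whole table cells, their length does not matter. One limitation of this sketch: if a value cell is blank, the label is paired with whatever non-empty cell comes next, so real input may need an extra check.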

Upvotes: 2

Andreas Linden

Reputation: 12721

It is better to use SimpleXML instead of regex to parse HTML!

Upvotes: 0
