Melignus

Reputation: 149

Regex to extract a block of text between matches

Alright, so here's my problem. I'm trying to write a PHP script that will parse our workorder system and return a set of tickets, but I've run into a snag parsing the ticket list. I've been using regex as much as possible to force myself to learn the syntax, and I could swear this should work, but it doesn't, so I come here seeking your collective wisdom.

<tr>
   ...
   ...
   ...
   ...
</tr>

I am trying to retrieve the block between the tags so that I can parse it further for specific information. The block size is fairly regular, but the number of lines between the tags may vary with the length of the ticket's description. The regex that I'm currently using is

/<tr>(.+)<\/tr>/

This seems the most compact way to achieve my goal, but I am getting errors from preg_match. I realize I could set a flag and loop, as in this very rough pseudocode

if /<tr>/ then {
   while != /<\/tr>/ {
      store line
   }
}

however my goal here is to gain a better understanding of regex and how to use it.

Upvotes: 0

Views: 782

Answers (2)

racerror

Reputation: 1629

Use Simple HTML DOM.

Parsing HTML with regex is a mess.
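As a rough illustration of the parser approach, here is a minimal sketch that uses PHP's bundled DOM extension (DOMDocument) rather than Simple HTML DOM itself, so it runs without installing anything. The sample markup and the $html and $rows names are assumptions made up for the example, since the real ticket page isn't shown in the question.

```php
<?php
// Hypothetical sample input standing in for the ticket list page.
$html = '<table><tr><td>1001</td><td>Printer jams</td></tr>'
      . '<tr><td>1002</td><td>Password reset</td></tr></table>';

$doc = new DOMDocument();
// Collect libxml warnings internally instead of printing them,
// since real-world markup is rarely perfectly valid.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Build one array of cell texts per <tr>.
$rows = [];
foreach ($doc->getElementsByTagName('tr') as $tr) {
    $cells = [];
    foreach ($tr->getElementsByTagName('td') as $td) {
        $cells[] = trim($td->textContent);
    }
    $rows[] = $cells;
}

print_r($rows);
```

Each entry of $rows is then a plain array of cell strings, so there is no second round of pattern matching to pull fields out of a block.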

Upvotes: 1

Sjoerd

Reputation: 75588

  • Maybe you need the s (PCRE_DOTALL) modifier, so that . also matches newlines and the pattern can span multiple lines.
  • Maybe you want .+? instead of .+, or the U (PCRE_UNGREEDY) modifier, to make the match non-greedy.
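Putting both suggestions together might look like the sketch below. The sample markup and the $html, $pattern, and $blocks names are assumptions for illustration only; the actual ticket page isn't shown in the question. It uses preg_match_all instead of preg_match so that every block is returned, not just the first.

```php
<?php
// Hypothetical sample input; the real ticket markup is not shown in the question.
$html = <<<HTML
<table>
<tr>
   <td>1001</td>
   <td>Printer jams on tray 2</td>
</tr>
<tr>
   <td>1002</td>
   <td>Password reset</td>
</tr>
</table>
HTML;

// s (PCRE_DOTALL): "." also matches newlines, so a block may span lines.
// .+? is lazy: it stops at the first </tr> rather than the last one.
$pattern = '/<tr>(.+?)<\/tr>/s';

// preg_match_all collects every block; $matches[1] holds the captured group.
$count = preg_match_all($pattern, $html, $matches);

// Trim the surrounding whitespace from each captured block.
$blocks = array_map('trim', $matches[1]);

foreach ($blocks as $block) {
    echo $block, "\n---\n";
}
```

On this sample, the original pattern /<tr>(.+)<\/tr>/ finds nothing, because without the s modifier the dot stops at each line break. With s but a greedy .+, it would return a single block running from the first <tr> to the last </tr>.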

Upvotes: 2
