wewereweb

Reputation: 199

Regular expression with multiple results

What's wrong with my regex?

"/Blabla\(2\)&nbsp;:.*<tr><td class=\"generic\">(.*)<\/td>.+<\/tr>/Uis"

....

<tr>
<td class="aaa">Blabla(1)&nbsp;:</td>
<td>
<table class="bbb"><tbody>
<tr class="ccc"><th>title1</th><th>title2</th><th>title3</th></tr>
<tr><td class="generic">word1</td><td class="generic">word2 </td><td class="generic">word3</td></tr>
<tr><td class="generic">word4</td><td class="generic">word5 </td><td class="generic">word6</td></tr>
</tbody></table>
</td>
</tr>

<tr>
<td class="aaa">Blabla(2)&nbsp;:</td>
<td>
<table class="bbb"><tbody>
<tr class="ccc"><th>title1</th><th>title2</th><th>title3</th></tr>
<tr><td class="generic">word1b</td><td class="generic">word2b </td><td class="generic">word3b</td></tr>
<tr><td class="generic">word4b</td><td class="generic">word5b </td><td class="generic">word6b</td></tr>
</tbody></table>
</td>
</tr>

What I want is the content of the first TD of each TR in the block that begins with Blabla(2).

So the expected result is word1b and word4b, but only the first is returned.

Thank you for your help. Please don't tell me to use a DOM parser; that's not possible in my case.

Upvotes: 0

Views: 1341

Answers (2)

Darka

Reputation: 2768

Thanks to @Jerry, I learned a new trick today:

(Blabla\(2\)&nbsp;:.*?|\G)<tr><td class=\"generic\">\K([^<]+).+?<\/tr>\r\n

Upvotes: 0

Jerry

Reputation: 71598

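That's an interesting regex; I learned about the ungreedy flag from it, nice!

As for your problem: the literal Blabla\(2\) occurs only once in the input, so a pattern that must start with it can match only once. You can use \G to continue matching immediately after the previous match, together with the g flag (assuming a PCRE engine):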

/(?:Blabla\(2\)&nbsp;:|(?<!^)\G).*<tr><td class=\"generic\">(.*)<\/td>.+<\/tr>/Uisg

regex101 demo

Or a little shorter with different delimiters:

'~(?:Blabla\(2\)&nbsp;:|(?<!^)\G).*<tr><td class="generic">(.*)</td>.+</tr>~Uisg'
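
If your engine has no \G, the same result can be reached in two steps instead of one pattern. The sketch below is in Python, whose standard `re` module does not support \G or \K, so it swaps in a different technique. It first cuts out the block that follows the label, then collects the first "generic" cell of each row. The function name `first_cells`, the `SAMPLE` string, and the assumption that the block ends at the first `</table>` after the label are mine, not from the question.

```python
import re

# Shortened copy of the markup from the question (illustrative only).
SAMPLE = """
<tr><td class="aaa">Blabla(1)&nbsp;:</td><td>
<table class="bbb"><tbody>
<tr class="ccc"><th>title1</th><th>title2</th><th>title3</th></tr>
<tr><td class="generic">word1</td><td class="generic">word2 </td><td class="generic">word3</td></tr>
<tr><td class="generic">word4</td><td class="generic">word5 </td><td class="generic">word6</td></tr>
</tbody></table></td></tr>
<tr><td class="aaa">Blabla(2)&nbsp;:</td><td>
<table class="bbb"><tbody>
<tr class="ccc"><th>title1</th><th>title2</th><th>title3</th></tr>
<tr><td class="generic">word1b</td><td class="generic">word2b </td><td class="generic">word3b</td></tr>
<tr><td class="generic">word4b</td><td class="generic">word5b </td><td class="generic">word6b</td></tr>
</tbody></table></td></tr>
"""

# One data row: capture the first "generic" cell, then skip lazily to the row end.
ROW = re.compile(r'<tr><td class="generic">(.*?)</td>.*?</tr>', re.S | re.I)


def first_cells(html, label):
    """Return the first generic cell of every row in the block after `label`."""
    heading = re.search(re.escape(label), html)
    if heading is None:
        return []
    # Assumption: the block ends at the first </table> after the label.
    end = html.find("</table>", heading.end())
    if end == -1:
        end = len(html)
    return ROW.findall(html[heading.end():end])


print(first_cells(SAMPLE, "Blabla(2)&nbsp;:"))
```

Because the label is matched once and the rows are matched separately, the row pattern is free to match as many times as there are rows, which is exactly what the single-pattern version could not do.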

Upvotes: 1
