Conner Hasbrouck

Reputation: 21

Get a string with HTML tags inside a larger string with ColdFusion regex

I'm new to regular expressions and could use some help.

I am attempting to use a ColdFusion REReplace to scrape data and get my desired content.

This is what I have so far:

<cfoutput>
#REReplace("Remove this please <p>Make this Display Please</p> Remove this please", "", "", "All")#
</cfoutput>

What regular expression could take that string and return only "Make this Display Please"?

Upvotes: 0

Views: 172

Answers (1)

Wiktor Stribiżew

Reputation: 626926

To extract a substring from a longer string, match everything up to the part you need, capture that part with a capturing group (...), and then match the rest of the string up to the end. The replacement is the \1 backreference, which refers to the text captured by the group, so the whole match is replaced with just the captured text.

So, use

#REReplace("Remove this please <p>Make this Display Please</p> Remove this please", ".*<p>(.*?)</p>.*", "\1", "All")#

The regex matches:

  • .* - matches any characters other than a newline, as many as possible, from the beginning up to the last <p> that is followed by a </p>
  • <p> - matches the literal <p>
  • (.*?) - captures into group 1 zero or more characters other than a newline, as few as possible (here, up to the closest </p>)
  • </p> - matches the literal </p>
  • .* - matches the rest of the text to the end (no newlines).

To match across newlines, use [\s\S] instead of the dot (.).
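As a minimal sketch of that [\s\S] variant, assuming a multi-line input (the variable names input and result and the sample text with line feeds are made up for illustration, not taken from the question):

```cfml
<!--- Hypothetical multi-line input; Chr(10) inserts a line feed. --->
<cfset input = "Remove this please" & Chr(10)
    & "<p>Make this" & Chr(10) & "Display Please</p>" & Chr(10)
    & "Remove this please">

<!--- Same pattern as above, with every dot replaced by [\s\S]
      so that each part can also run across line breaks. --->
<cfset result = REReplace(input, "[\s\S]*<p>([\s\S]*?)</p>[\s\S]*", "\1", "All")>

<cfoutput>#result#</cfoutput>
```

This should leave only the text between <p> and </p>, line break included. If the input contains no <p>...</p> pair at all, the pattern does not match and REReplace returns the string unchanged, so it may be worth checking for that case before using the result.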

Upvotes: 2
