AngelOfEffekt

Reputation: 3

Regex for matching mandatory and optional characters in random order

I need a regular expression to find text between HTML elements using the Visual Studio search engine (might be C#).

What partly works is this:

>\s*([\w])+\s*<

But it has to match all the following "asdf"s:

<element>asdf
  <element>asdf.</element>asdf
  <element />
asdf asdf
</element>
<element>
  asdf!
</element>

What it should NOT find is whitespace between two tags; this example should match NOTHING:

<element>

  <element>  </element>
</element>

What I need in particular is a regex, that matches:

I don't want matches that contain only special characters and no \w character.

Another attempt, which doesn't work at all, is this:

>\s*((?=[\w]+)(?=[ ?=()!"_]*))\s*<

What is the correct way to accomplish this?

Thank you so much!

Upvotes: 0

Views: 1182

Answers (1)

CertainPerformance

Reputation: 370879

You can use a single lookahead before matching the text between the > and the <:

>(?=[^<]*\w).*?<

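(Use the "s" flag so that the dot matches newlines; in .NET that is RegexOptions.Singleline or an inline (?s). Alternatively, use something like [\S\s]*? instead of .*?)

The lookahead ensures that there is at least one word character somewhere between the > and the next <. After that, the pattern lazily matches any characters until it reaches that <, so punctuation and spaces are allowed as long as a \w is present.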

https://regex101.com/r/cqinyh/2
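
To check the pattern outside of regex101, here is a minimal sketch in Python (not the asker's Visual Studio/C# environment, so flavor differences are possible). It uses the [\s\S]*? variant so that no flag is needed. The names PATTERN, text_between_tags, and the two sample strings are made up for this illustration; the samples are copied from the question.

```python
import re

# Pattern from the answer, with [\s\S]*? in place of .*? so that
# newlines are matched without needing the "s" (dotall) flag.
PATTERN = re.compile(r">(?=[^<]*\w)[\s\S]*?<")

# First sample from the question: every "asdf" run should be found.
sample_with_text = """<element>asdf
  <element>asdf.</element>asdf
  <element />
asdf asdf
</element>
<element>
  asdf!
</element>"""

# Second sample from the question: only whitespace between tags.
sample_whitespace_only = """<element>

  <element>  </element>
</element>"""


def text_between_tags(markup):
    """Return the trimmed text of each match, without the enclosing > and <."""
    return [m.group(0)[1:-1].strip() for m in PATTERN.finditer(markup)]


print(text_between_tags(sample_with_text))
# ['asdf', 'asdf.', 'asdf', 'asdf asdf', 'asdf!']
print(text_between_tags(sample_whitespace_only))
# []
```

Note that each match consumes the closing <, which is why the next search starts at the following >. That is fine here, but the pattern is still a plain-text heuristic: a > inside an attribute value or a comment would confuse it.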

Upvotes: 1
